FOREWORD


Vladimir Aleksandrovich Fock was one of the group of brilliant 
physics theoreticians whose work built the magnificent edifice of 
the quantum theory. 

Forty-five years have passed since he wrote the first edition of
this unique course in quantum mechanics, which was the first of 
its kind in the Soviet Union and one of the first in the world. 

Despite the 45 years, the nature of this book, the organization 
of the material, and the interpretation of specific topics all appear 
more natural today than when the book was first published. This
is true only of books whose authors have taken an active
part in creating the subject of research and have understood it
much more deeply than most of their contemporaries.

Practically every section of the course has to do with the
author’s own research, much of which has become an essential
element of the quantum theory. This lends the course the special
attraction of being basic. 

As the title indicates, the book lays no claim to complete 
coverage of the subject. It gives only the simplest applications in 
addition to the basic concepts of the theory. Many of the more 
complex problems, such as the theories of molecules, atomic 
nuclei, and solids, are not included in the course. 

The changes and additions to be found in the second edition 
are primarily related to the introductory part of the course, 
Chapter I, and concern the philosophical foundations of the quantum
theory. Fock considered it of prime importance to formulate
the basic concepts of quantum mechanics from the proper materialistic
standpoint. The views outlined in Chapter I evolved from
numerous discussions, some with Niels Bohr. In fact, there is 
evidence that Fock’s criticism of “non-observability in principle” 
prompted Bohr to abandon this idea in his later works. Because 
of the attention paid to the epistemological aspects of the theory 
and their detailed and consistently materialistic interpretation, 
this book differs favourably from other courses. 

The new edition is also augmented by sections devoted to the 
method of a self-consistent field, to the intrinsic symmetry of the 
hydrogen atom, and to other problems. Due to this it more fully 
reflects Fock’s contribution to the quantum theory. 

Finally, some revisions have been made in the section on the 
Dirac equation and the positron theory, which was such a sensation
when the book was first written. Fock recognized the greatness
of the theoretical prediction of the existence of antiparticles
but always emphasized the incompleteness of the positron theory, 
since it is impossible to give an exact description of the processes 
of creation and annihilation with a single-particle equation. 

One of the remarkable features of Fock’s scientific work is his 
amazing mathematical powers, his ability to solve complicated 
mathematical problems using the simplest and most unexpected 
methods. This quality is especially evident in his scientific papers, 
of course, but in this book too one can easily feel the author’s 
brilliant mathematical individuality combined with a precise and 
simple manner of presentation. 

The preparation of the second edition proved to be the last 
work of V. A. Fock. We feel sure that those who use this book to 
study quantum mechanics will experience the joy of dealing with 
a primary source and feel the spirit of that wonderful time when 
the horizons of human knowledge were immeasurably expanded 
in a matter of years. Fock witnessed and participated in this 
process. 

The author died before the book came out, but he did succeed 
in preparing most of it for press. 

The Institute of Physics at
Leningrad State University

Professor M. G. Veselov
Professor Yu. N. Demkov




PREFACE TO THE SECOND RUSSIAN EDITION 


The second edition of this book, unlike the first, devotes a 
separate chapter to the nonrelativistic theory of the electron spin 
(Pauli’s theory of the electron) and contains a chapter on the 
many-electron problem of quantum mechanics. In addition, some 
of the author’s findings have been incorporated as separate 
sections. Otherwise, the subject matter of the book (both the mathematical
theory and its physical interpretation) remains the same,
except for certain new formulations of an epistemological character
(the concepts of relativity with respect to the means of observation
and of potential possibility), which has necessitated changing
the expression “the statistical interpretation of quantum mechanics”
to “the probabilistic interpretation”. The new formulations
are more precise than the previous ones.

The title of the book speaks for itself. The word “fundamentals”
can be understood as “basic principles” or as “introductory
facts”. 

We hope that although more than 40 years have passed since 
the book was written, the material in it has not lost its timeliness 
and will be useful to students of quantum mechanics. 

1974 V. A. Fock



PREFACE TO THE FIRST RUSSIAN EDITION 


This book was conceived as an elaboration of the reports on 
Dirac’s theory of the electron delivered by the author at the 
Leningrad State Optical Institute in early 1929. The original plan
was, however, expanded to include, besides Dirac’s theory, which is
considered in Part III, the basic concepts of quantum mechanics
(Part I) and Schrödinger’s theory (Part II).

From the vast subject of the quantum theory the author has 
chosen material limited in two respects. First, the book considers
only the main principles and simplest applications of quantum
mechanics. It concerns itself exclusively with the one-body problem.
It does not deal with the many-body problem or the Pauli
exclusion principle, which is basic to that problem. Second, the author has
sought to confine himself to that part of the theory that is considered
proved, that is, quantum mechanics proper. He has not
examined quantum electrodynamics since this theory has yet to 
be fully elaborated. 

The author’s main purpose is to introduce the reader to a new 
set of ideas differing greatly from the classical theory. He has 
endeavoured to avoid using images from the classical theory as 
being inapplicable to quantum physics. Rather, he has attempted 
to familiarize the reader with the basic concepts underlying a 
quantum description of the states of atomic systems. 

As for the presentation of material the author believes that a 
fairly detailed examination of the mathematics of a problem 
facilitates rather than hinders understanding, since it eliminates 
difficulties that the reader may encounter in dealing with the 
mathematical aspects and thus allows attention to be focussed on 
the physics of the problem. 

This book is intended for senior students of physics and mathematics
and persons with a sufficient preparation in mathematics.

V. A. Fock

Leningrad
The State Optical Institute
August 1931

Part I


BASIC CONCEPTS 
OF QUANTUM MECHANICS 


Chapter I 


THE PHYSICAL AND EPISTEMOLOGICAL BASES 
OF QUANTUM MECHANICS 

1. The need for new methods and concepts 
in describing atomic phenomena 

Quantum mechanics appeared during the first few decades of
the twentieth century on the basis of studies of atomic phenomena. The
structure of the atom, the properties of electrons and atomic 
nuclei, the very stability of a system consisting of a positively 
charged nucleus and negatively charged electrons, the radiation 
of light by atoms and molecules, and, last, the diffraction of 
electrons — all these properties and phenomena require for their 
explanation ideas and physical concepts that differ substantially 
from the ideas and concepts of classical physics. 

A precise formulation of the new concepts demands new mathematical
tools, and we will familiarize ourselves with these in
subsequent chapters. But we will try to explain the principal 
difference between quantum mechanics and classical mechanics 
in this introductory chapter. 


2. The classical description of phenomena 

When we describe various phenomena in terms of classical 
physics, we assume that physical processes are independent of 
the conditions of observation. Thus we take it for granted that we 
can always “spy” on the process and yet not interfere with it or 
influence it. True enough, if we “spy” on a physical process from
different viewpoints (and correspondingly use different frames
of reference for its description), it will appear to us in different 
ways. For instance, the free fall of a body may proceed in a 
straight line in one frame of reference and in a parabola in 
another. But the dependence of the form of a phenomenon on the 
frame of reference has always been taken into consideration, 
namely, by transforming from the coordinates of one frame of reference
to the coordinates of another. Such a change in form
introduces nothing new into the phenomenon. For this reason in
classical physics we can speak of the independence of a phenomenon
from the manner of observation.

Quantum mechanics has shown that in microprocesses this is 
not the case. Here the very possibility of observation presupposes 
definite physical conditions that may be related to the essence of 
the process. Specifying these conditions does not mean simply 
indicating a particular frame of reference but requires more detailed
elaboration.

Neglect of these considerations leads to an abstraction that we 
may call the absolutization of physical processes. If we accept 
this abstraction, it becomes possible to consider physical processes 
as occurring by themselves regardless of whether there is a real 
possibility of their observation (that is, whether the appropriate 
physical conditions exist for such processes). 

The use of this abstraction is justified in studying macroscopic 
phenomena, for in these the influence produced by a measurement 
is for all practical purposes negligible. The absolutization of such
phenomena seemed so natural that before the appearance of
quantum mechanics it was never specifically stated. It went without
saying that physical processes occur by themselves, which
considerably simplified their description since there was no need 
to specify the conditions of observation. All of classical physics is 
based on the absolutization of physical processes. This abstraction 
is one of its characteristic features. 

Another abstraction permitted in classical physics is the
possibility of unlimited refinement of observation. By this we
mean not only an increasingly precise measurement of a specific
quantity but also the simultaneous measurement of any other quantity
related to the observed object or phenomenon. This can be
called the particularization of measurements. Even when measuring
different quantities requires different conditions of observation,
classical physics considers it possible to combine the results 
in an overall picture describing the physical process under 
investigation. There is a logical connection between allowing
for the independence of the physical process from the conditions
of observation, that is, absolutization of the process, and
allowing for the possibility of encompassing different aspects 
and characteristics of the behaviour of an object in the physical 
process. 

The concepts of classical physics prompt the idea that not only 
an absolute but an exhaustive description of the state of motion 
of a physical system (with certain degrees of freedom) is possible. 
And an exhaustive description is assumed to be achieved if there 
is full particularization of observations and further observations 
can add nothing new. 

3. Range of application of the classical way
of describing phenomena.
Heisenberg’s and Bohr’s uncertainty relations

Such fundamental facts as the wave-corpuscular duality of light 
and of particles of matter prove that the classical way of describing
phenomena is unsuitable for micro-objects. At the same time
we cannot dismiss it completely, since to describe phenomena
objectively we must rely, directly or indirectly, on something that
does not require reservations concerning the manner of observation.
And this is the case with the “absolute” manner of description
used in classical physics.

To apply the classical (absolute) manner of description intelligently
we must first establish its limits. If we assume that the
mathematical apparatus of quantum mechanics is known, the
relations of classical physics derive from it in the form of a
certain approximation, and the limits of application of the classical
manner of description prove to be the conditions of applicability
of this approximation. But in our discourse we proceed from
classical mechanics and can only use the simplest quantum relationships.

Let us consider a simple case, the motion of a mass point with 
a mass m. In classical mechanics the state of motion of a mass 
point at any given moment of time is determined by its position 
$(x, y, z)$ and momentum $(p_x, p_y, p_z)$. It would be incorrect, however,
to consider the two sets simultaneously without referring
to the possibility of their measurement, which is limited by quantum
effects.

As Werner Heisenberg proved, the localization of a particle in 
space demands conditions that are not favourable for measuring 
its momentum, that is, for the localization of the particle in momentum
space. Conversely, conditions that are needed to measure
the momentum of a particle preclude the possibility of localizing 
the particle in ordinary space. 

Quantum effects, which limit the possibility of measurement, 
manifest themselves, for instance, when light quanta irradiate a 
particle. What is important here is that a photon, which is characterized
by wave parameters, is at the same time a bearer of
definite energy and momentum, which makes it a “particle of
light”. The wave parameters are: the frequency $\nu$ (or the angular
frequency $\omega = 2\pi\nu$), the wavelength $\lambda = c/\nu$ ($c$ is the velocity of
light), and the wave vector $\mathbf{k}$, which determines the direction of
the wave’s propagation (the absolute value of $\mathbf{k}$ is
$k = 2\pi/\lambda = 2\pi\nu/c = \omega/c$). If we define $\hbar$ as Planck’s constant $h$ divided by
$2\pi$, that is, $h = 2\pi\hbar$, the energy of the photon, $E$, and its
momentum, $\mathbf{p}$, will be related to the wave parameters as

$$E = \hbar\omega, \qquad \mathbf{p} = \hbar\mathbf{k} \quad (p = 2\pi\hbar/\lambda) \tag{3.1}$$

where

$$\hbar = 1.054 \times 10^{-27}\ \text{erg s} \tag{3.2}$$

It follows from Eqs. (3.1) that using light of short wavelengths 
favourable for localizing a particle in ordinary space means using 
high-energy photons that are capable of transferring a great 
impact (momentum) to the particle and thereby upsetting its 
localization in momentum space. Using low-energy photons means 
using light of long wavelengths, and this in turn broadens the 
diffraction bands and reduces the precision of localizing a particle 
in ordinary (coordinate) space. 
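
The trade-off described above can be made concrete with a small numerical sketch of Eqs. (3.1). The value of Planck's constant is the one quoted in (3.2); the speed of light and the two sample wavelengths are standard or illustrative values that are not taken from the text, and the function name is hypothetical.

```python
import math

HBAR = 1.054e-27   # erg s, the value quoted in Eq. (3.2)
C = 2.998e10       # cm/s, speed of light (standard value, not given in the text)

def photon_energy_momentum(wavelength_cm):
    """Return (E, p) in CGS units for a photon of the given wavelength."""
    k = 2.0 * math.pi / wavelength_cm   # wave number k = 2*pi/lambda
    omega = C * k                       # angular frequency omega = c*k
    return HBAR * omega, HBAR * k       # Eq. (3.1): E = hbar*omega, p = hbar*k

# Illustrative wavelengths (assumptions): visible light and an X-ray.
for name, lam in [("visible light", 5.0e-5), ("X-ray", 1.0e-8)]:
    E, p = photon_energy_momentum(lam)
    print(f"{name}: lambda = {lam:.1e} cm, E = {E:.3e} erg, p = {p:.3e} g cm/s")
```

The shorter wavelength, which would localize a particle more sharply in coordinate space, carries the larger momentum and therefore disturbs the particle's momentum more strongly.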

Equations (3.1) relate the wave properties of a photon to its 
corpuscular properties. Their right-hand members contain $\omega$ and $\mathbf{k}$,
which are determined by the diffraction pattern, and their left-hand
members, $E$ and $\mathbf{p}$, describe the photon as a particle. Hence
Eqs. (3.1) reflect the wave-corpuscular duality of a photon. 

The wave-corpuscular duality proves to be a general property 
not only of photons but of all particles. This makes it possible to 
correlate the concepts of the electron as a particle and as a wave. 
The first to suggest the idea of the wave property of matter was 
Louis de Broglie, and proof came later when the diffraction of 
electrons was discovered. A more precise statement of this idea 
is contained in a proper interpretation of the mathematical 
apparatus of quantum mechanics. 

We can express the results of Heisenberg’s reasoning, just 
elaborated, concerning the limits of precision of measurement in 
the form of the following inequalities: 

$$\Delta x\,\Delta p_x \gtrsim \hbar, \qquad \Delta y\,\Delta p_y \gtrsim \hbar, \qquad \Delta z\,\Delta p_z \gtrsim \hbar \tag{3.3}$$

in which $\Delta x$, $\Delta y$, $\Delta z$ characterize the size of the region in coordinate
space $(x, y, z)$ containing the particle, and $\Delta p_x$, $\Delta p_y$, $\Delta p_z$ the size of
the region in momentum space $(p_x, p_y, p_z)$ containing the particle.
The inequalities show that the very nature of a particle makes it
impossible to localize it simultaneously in coordinate space and
in momentum space. They are called the Heisenberg uncertainty
relations, or simply the uncertainty relations. The word “uncertainty”
is understood to mean the regions of localization, $(\Delta x, \Delta y, \Delta z)$
and $(\Delta p_x, \Delta p_y, \Delta p_z)$, in the corresponding coordinate and momentum
spaces.

We can couple the uncertainty relations (3.3) with

$$\Delta t\,\Delta(E' - E) \gtrsim \hbar \tag{3.4}$$

which links the uncertainty in the change of energy of a particle,
$E' - E$, with the uncertainty in the time during which this change
occurred. According to (3.4), the transfer of energy cannot be
localized precisely in time. Relation (3.4) can be called the Heisenberg-Bohr
uncertainty relation.

The uncertainty relations (3.3) and (3.4) characterize the range 
of application of the classical (absolute) manner of describing 
phenomena. Since Planck’s constant is small, this manner of 
description can unquestionably be used in referring to macroscopic
bodies and their interactions. But this does not exhaust its
significance. It is important in describing quantum processes 
because it is applied to the instruments used to study atomic 
objects. Experiments (with atomic objects as well) are always 
described in the classical (absolute) way. 
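
To see why the smallness of Planck's constant makes the classical description safe for macroscopic bodies, one can put rough numbers into relations (3.3). The sketch below treats the relation as an order-of-magnitude estimate; the masses and region sizes are illustrative assumptions, not values from the text, and the function name is hypothetical.

```python
HBAR = 1.054e-27  # erg s, the value quoted in Eq. (3.2)

def min_velocity_spread(mass_g, delta_x_cm):
    """Order-of-magnitude velocity spread implied by delta_x * delta_p ~ hbar."""
    delta_p = HBAR / delta_x_cm   # smallest momentum spread allowed by (3.3)
    return delta_p / mass_g       # convert momentum spread to velocity spread

# Assumed examples: an electron confined to atomic dimensions (1e-8 cm)
# and a 1 mg grain whose position is known to within one micron (1e-4 cm).
electron = min_velocity_spread(9.11e-28, 1.0e-8)
grain = min_velocity_spread(1.0e-3, 1.0e-4)

print(f"electron: about {electron:.1e} cm/s")
print(f"grain:    about {grain:.1e} cm/s")
```

For the electron the spread is of the order of 10^8 cm/s, comparable with typical atomic velocities, whereas for the grain it is far below anything measurable.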

The instruments and other means of observation, including 
the human senses (which, so to say, play the role of instruments 
built into the human body), are the necessary intermediaries 
between the human brain and the atomic object under consideration.
We can now define more accurately what is meant by means
of observation by indicating the manner of their description: 

The means of observation must be described in classical terms
but with due regard to the uncertainty relations (3.3) and (3.4).

4. Relativity with respect to the means 
of observation as the basis for the quantum way 
of describing phenomena 

The new, quantum manner of describing phenomena must allow 
for the possibility of actual measurement of the properties of a 
micro-object. We must not ascribe to any object properties and 
states of motion that cannot be justified. For this reason particular 
attention should be given to the way in which we specify properties
and states of motion. We must bear in mind the design and
operation of the instruments that create the conditions to which 
the object is subjected. As has been said, the instruments and the 
external conditions must be described in the classical manner by 
indicating their parameters. It stands to reason that these parameters
can be defined only to an accuracy permitted by the
uncertainty relations. Otherwise we will be exceeding the actual 
potential of the measuring instruments. 

A micro-object is revealed in its interaction with an instrument. 
For instance, the path of a charged particle becomes visible in 
the irreversible snowballing process that takes place in a cloud 
chamber or in the emulsion of a photographic plate (the particle 
loses its energy in ionizing the vapour or the chemicals of the 
emulsion; hence, its momentum becomes uncertain). The results
of the interaction of an atomic object with a measuring instrument
(which is described classically) are the main experimental elements,
the systematization of which, based on the assumptions
about the properties of the object, makes up the aim of the theory:
from a study of such interactions we can deduce the properties of
the atomic object, and the predictions of the theory are formulated
as the expected results of these interactions.

Such a statement of the problem allows the introduction of 
quantities that characterize the object irrespective of the measuring
instrument (electric charge, mass, and properties described
by quantum mechanical operators) and at the same time makes 
possible a comprehensive approach to the object: the object can 
be viewed from the aspect (wave or corpuscular, for instance) 
necessitated by the instrument and by the external conditions the 
instrument creates. 

The new statement of the problem makes it possible to consider 
the case when the various aspects and properties of an object do 
not manifest themselves simultaneously, that is, when particularization
of the object’s behaviour is impossible. This will be so
if incompatible external conditions are needed for the manifestation
of the object’s properties (for instance, wave and corpuscular).

We can act on the proposal of Niels Bohr and call complementary
the properties that reveal themselves in their pure form
only in different experiments held in mutually exclusive conditions,
whereas in conditions of one and the same experiment they manifest
themselves only in an incomplete, modified form (for
instance, the incomplete localization in the coordinate and the
momentum space permitted by the uncertainty relations). There
is no sense in considering complementary properties simultaneously
(in the pure form), which explains the absence of a
contradiction in the concept of wave-corpuscular duality.

By making the results of the interaction of a micro-object and 
a measuring instrument the basis of the new manner of description 
we introduced an important concept, the concept of relativity with 
respect to the means of observation, which generalizes the well- 
known concept of relativity with respect to the frame of reference. 
Such a manner of description does not at all mean that we are 
ascribing a lesser degree of reality to the micro-object than to 
the measuring instrument or that we are reducing the properties 
of the micro-object to the properties of the instrument. On the 
contrary, a description on the basis of the concept of relativity 
with respect to the means of observation gives a much deeper, 
more refined, and more objective picture of the micro-object than 
was possible on the basis of the idealizations of classical physics. 
Such a picture also requires a more sophisticated mathematical 
apparatus, namely, the theory of linear operators, including eigenfunctions
and eigenvalues, the theory of groups, and other mathematical
concepts. The use of this apparatus in quantum physics
made it possible to give a theoretical explanation of some fundamental
properties of matter that could not be explained in the
classical way and also to calculate the values of many quantities 
observed in experiments (for instance, the frequencies in atomic 
spectra). But more than that — and this is no less important to 
us — the physical interpretation of the mathematical concepts used 
in quantum mechanics leads to a number of profound and principled
conclusions; for one, generalization of the concept of the
state of a system on the basis of the concepts of probability and 
potential possibility. 


5. Potential possibility in quantum mechanics 

If we take the act of interaction between an atomic object and 
a measuring instrument as the source of our judgements about 
the object’s properties and if in studying phenomena we allow 
for the concept of relativity with respect to the means of observation,
we are introducing a substantially new element into the
description of the atomic object and its state and behaviour, that
is, the idea of probability and thereby the idea of potential possibility.
The need to consider the concept of probability as a substantial
element of description rather than a sign of incompleteness
of our knowledge follows from the fact that for given external 
conditions the result of the object’s interaction with the instrument 
is not, generally speaking, predetermined uniquely but only has 
a certain probability of occurring. With a fixed initial state of the 
object and with given external conditions a series of such interactions
results in statistics that correspond to a certain probability
distribution. This probability distribution reflects the potential
possibilities that exist in the given conditions.

Let us consider an experiment with a physical system that 
would enable us to make predictions about the results of future 
interactions between the system and measuring instruments of 
various kinds. Such an initial experiment must include a certain 
preparation of the system (for instance, preparation of a monochromatic
beam of electrons) and the creation of certain external
conditions in which the system will be placed after the preparation
(for instance, the passage of the electron beam through a
crystal). At times it is advisable to consider the preparation of 
the system and the creation of external conditions as two 
different stages of the experiment, but the two stages can also be 
considered one initial experiment, the purpose of which is to obtain 
predictions: 


The initial experiment is always addressed to the future.

The manner of preparation and the external conditions in an 
initial experiment are described in the language of classical physics,
but its result, which must give a full catalogue of the
potential possibilities for the given conditions, requires new, 
quantum mechanical means for formulation. To have an idea of 
why we must use these means, let us consider how the potential 
possibilities existing in the given conditions materialize. 

First of all we must bear in mind that a final experiment, in 
which the potential possibilities materialize, may be conducted in 
different ways: the registering instrument may be of different 
construction (as a rule, one excludes another). As in the initial 
experiment, the construction and operation of the instrument are 
described in the classical way. The different versions of the final 
experiment and the corresponding instruments can be characterized
by the type of the quantity they measure (position, momentum,
etc.).

Thus, with the initial experiment given, there is first of all a 
possibility of choosing different types of instruments for the final 
experiment. In any case, 

The final experiment is always addressed to the past 

(and not to the future in contrast to the initial experiment). It 
can be called the verifying experiment because it enables us to 
verify the predictions of the initial experiment. 

Let us assume that the type of verifying experiment has been 
chosen. How do we formulate its result? We must always remember 
that we are talking about potential possibilities, which are created 
in the initial experiment and realized in the verifying experiment. 
For a given type of verifying experiment these potential possibilities
are expressed as probability distributions for the given
quantity (more precisely, for the values of the quantity that can
be obtained in the verifying experiment). Hence it is the probability
distribution we seek to verify. Clearly, this cannot be done by
a single measurement but requires many repetitions of the entire
experiment (with the same preparation of the object and the same 
external conditions). The statistics obtained in this process of 
repetition makes it possible to draw a conclusion about the probability
distribution that is to be studied.
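
The role of repetition can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the three outcome labels and their probabilities are invented for illustration, the "experiment" is replaced by a pseudo-random draw, and the function name is hypothetical. It shows that a single run yields one outcome, while many runs with the same preparation yield frequencies that approach the underlying distribution.

```python
import random
from collections import Counter

def run_total_experiment(probabilities, repetitions, seed=0):
    """Repeat a simulated verifying experiment and return observed frequencies.

    probabilities: dict mapping each possible outcome to its probability
    (a made-up distribution standing in for the one fixed by the initial
    experiment).
    """
    rng = random.Random(seed)
    outcomes = list(probabilities)
    weights = [probabilities[o] for o in outcomes]
    # Each draw plays the part of one initial + verifying experiment.
    counts = Counter(rng.choices(outcomes, weights=weights, k=repetitions))
    return {o: counts[o] / repetitions for o in outcomes}

assumed = {"A": 0.5, "B": 0.3, "C": 0.2}   # hypothetical distribution
for n in (1, 100, 100_000):
    print(n, run_total_experiment(assumed, n))
```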

A total experiment (an experiment that is carried out to the 
end and permits a comparison with theory) consists of the initial 
and verifying experiments combined and performed many times 
over. Here it is appropriate to note once more that for a given initial
experiment (for given initial conditions) the final experiment may 
be set up in different ways (the measured quantities may differ) 
and every type of final experiment has its own probability distribution.

Thus a theory must describe the initial state of a system in 
such a way as to make it possible to obtain probability distributions
for any type of final experiment from this state. In this way
we secure a full description of the potential possibilities that 
follow from the initial experiment. 

Since a final experiment may take place later than the initial 
experiment, a theory must also give the time dependence of 
probabilities and potential possibilities. The establishment of this 
dependence will play the same role as the discovery of the laws 
of motion did in classical physics. 



Chapter II 


THE MATHEMATICAL APPARATUS 
OF QUANTUM MECHANICS 


1. Quantum mechanics
and the linear-operator problems

An important step towards the creation of present-day quantum 
mechanics was Bohr’s postulation of two principles characterizing 
the properties of atomic systems. 

The first principle asserts that atomic systems have stationary 
states, in which they do not radiate or absorb energy. In these 
states an atomic system possesses energy values that form a 
discrete sequence $E_1, E_2, \ldots, E_n, \ldots$ (the energy levels of the
system). 

According to the second principle, radiation emitted or absorbed
by an atomic system in the transition from one energy level
to another has a definite frequency $\nu$ determined by the condition

$$E_m - E_n = h\nu$$

where $h$ is Planck’s constant, and $E_m$ and $E_n$ are the energy levels.
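
As a quick numerical illustration of the frequency condition, the sketch below applies it to two energy levels. The hydrogen level formula $E_n = -13.6\ \text{eV}/n^2$ and the numerical constants are standard values brought in as assumptions; they are not stated in this section, and the function names are hypothetical.

```python
H_EV_S = 4.135667e-15   # Planck's constant h in eV s (standard value)
C_M_S = 2.998e8         # speed of light in m/s (standard value)

def hydrogen_level(n):
    """Assumed energy of the n-th hydrogen level in eV (not from the text)."""
    return -13.6 / n**2

def transition_frequency(E_m, E_n):
    """Frequency from the condition E_m - E_n = h * nu (energies in eV)."""
    return (E_m - E_n) / H_EV_S

nu = transition_frequency(hydrogen_level(2), hydrogen_level(1))
wavelength_nm = C_M_S / nu * 1e9

print(f"nu = {nu:.3e} Hz, wavelength = {wavelength_nm:.1f} nm")
```

With these assumed levels the transition from the second to the first level falls in the ultraviolet, at a wavelength of roughly 122 nm.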

These principles conflict with classical mechanics and electro¬ 
dynamics but are fully confirmed in experiments. It is a natural 
idea, therefore, to propose replacing the classical theory by a 
theory that would harmonize with Bohr’s principles and be logically
consistent.

The problem of determining the stationary states of an atomic 
system, states that are described by definite energy values (and 
certain other constants of integration), is analogous to the problem
of mathematical physics where definite states of a system
are chosen from the whole set of states, namely, the problem of 
eigenfrequencies of oscillations, or, more generally, the linear- 
operator problem and the associated eigenvalue problem. In
problems of this kind a sequence of values of a given quantity
would emerge automatically from the whole set of values. Quantum
mechanics has substantiated this idea of quantization ever
since the historic paper of Erwin Schrödinger (1926) concerning
quantization as an eigenvalue problem. A certain linear operator
is related to each physical quantity, and the theory of
linear operators is the mathematical apparatus of quantum mechanics.


2. The operator concept and examples

Just as a function is a rule for finding the number $y = f(x)$
from a given number $x$, an operator $L$ is a rule that
maps a given function $\varphi(x)$ into a new function

$$\psi(x) = L[\varphi(x)] \tag{2.1}$$

A linear operator has the properties that, for any functions $\varphi_1$,
$\varphi_2$, $\varphi$,

$$L(\varphi_1 + \varphi_2) = L(\varphi_1) + L(\varphi_2)$$

$$L(a\varphi) = aL(\varphi) \tag{2.2}$$


where $a$ is an arbitrary complex number. Since we will deal only
with linear operators, the word “linear” will often be omitted.

Operators act on functions of one or several variables. The
variables can be either continuous, which is the case for the
coordinates (position) of an object, or discontinuous, that is,
assuming only discrete values, which is the case for energy levels
or the number that labels these levels. Continuous variables can
either take on any value or change within certain domains.
Discontinuous variables can take on either finite or infinite
sequences of values. We will always assume that the values of the
independent variables (or arguments) of a function are real numbers,
whereas the functions themselves, which the operators act on, can
be complex-valued. When specifying an operator, we must always
indicate the variables of the functions on which it acts.

Typical operators that act on functions of a continuous variable
$x$ are multiplication of a function by $x$ and differentiation
with respect to $x$:

$$ L[f(x)] = x f(x), \qquad L[f(x)] = \frac{\partial}{\partial x} f(x) $$


In the first case x plays a double role: it is the argument of f(x) 
and it is the operator itself. 

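Another example is the Laplacian operator $\nabla^2$:

$$ \nabla^2 f(x, y, z) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} $$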


There is also a class of linear operators that can be represented
by a definite integral:

$$ L[f(x)] = \int_a^b K(x, \xi)\, f(\xi)\, d\xi \tag{2.3} $$

where the function $K(x, \xi)$ is called the kernel of the operator.
As an example of a kernel let us consider Poisson’s differential
equation

$$ \nabla^2 F = f $$

If $f(x, y, z)$ is specified in an unbounded region and if the
asymptotic behaviour of the solution at infinity is $F \to 0$, the
solution of Poisson’s equation is

$$ F(x, y, z) = G[f(x, y, z)] = -\frac{1}{4\pi} \iiint \frac{f(\xi, \eta, \zeta)}{r}\, d\xi\, d\eta\, d\zeta $$

The operator $G$ has the kernel

$$ -\frac{1}{4\pi r} = -\frac{1}{4\pi \left[ (x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2 \right]^{1/2}} $$

If the equation

$$ L(F) = f \tag{2.4} $$

and the appropriate asymptotic conditions yield

$$ L^{-1}(f) = F \tag{2.5} $$

then it is said that the operator $L$ has an inverse (inverse
operator) $L^{-1}$. In our example $G$ is the inverse of $\nabla^2$.

If a variable takes on only discrete values, these values can
always be labelled by positive integers. Hence, a function of a
discontinuous variable can always be replaced by a function of a
positive integer, the number labelling the value. Any operator
that acts on the function $f_n$ of a positive integer $n$ (more
exactly, the result of this action) can be represented in the form
of a sum

$$ L f_n = \sum_m K_{nm} f_m \tag{2.6} $$

The totality of the $K_{nm}$ is called the matrix of the operator,
and the operator is said to be in the matrix representation.
Formula (2.6) is the exact analogue of formula (2.3), and the
matrix $(K_{nm})$ is the kernel of this operator.
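
As an illustration of (2.6), the following minimal sketch applies a small matrix to a sequence and checks the linearity properties (2.2). The matrix `D`, the sequences `f` and `g`, and the helper name `apply_operator` are hypothetical choices made for this example only.

```python
def apply_operator(K, f):
    """Return the sequence (Lf)_n = sum over m of K[n][m] * f[m], as in (2.6)."""
    return [sum(K_nm * f_m for K_nm, f_m in zip(row, f)) for row in K]


# Hypothetical 3x3 matrix: a forward-difference operator, a crude discrete
# stand-in for differentiation (the last row is an arbitrary boundary choice).
D = [[-1, 1, 0],
     [0, -1, 1],
     [0, 0, -1]]

f = [1.0, 4.0, 9.0]
g = [2.0, 0.0, -1.0]
a = 2 - 3j  # arbitrary complex number

Lf = apply_operator(D, f)
Lg = apply_operator(D, g)

# Linearity (2.2): L(a f + g) must equal a L(f) + L(g).
lhs = apply_operator(D, [a * x + y for x, y in zip(f, g)])
rhs = [a * x + y for x, y in zip(Lf, Lg)]
linear = all(abs(p - q) < 1e-12 for p, q in zip(lhs, rhs))

print(Lf)      # [3.0, 5.0, -9.0]
print(linear)  # True
```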

3. Hermitian conjugate. Hermiticity 

To every linear operator $L$ we can relate another operator $L^+$
that satisfies a certain functional equation and is called the
hermitian conjugate of $L$.¹ We will denote it by the same letter
as the original operator but with a dagger as a superscript ($L^+$
is pronounced el-dagger).

¹ In mathematics the terms adjoint, conjugate, and associate operator are
used.

We can define a hermitian conjugate operator as follows. Let $f$
and $g$ be two functions that satisfy some general conditions but
in all other respects are arbitrary. The functional equation that
determines a hermitian conjugate operator $L^+$ is

$$ \int \left[ \bar{g}\, L(f) - \overline{L^+(g)}\, f \right] dx = 0 \tag{3.1} $$

if all independent variables are continuous. It is

$$ \sum_n \left[ \bar{g}_n\, L(f_n) - \overline{L^+(g_n)}\, f_n \right] = 0 \tag{3.1*} $$

if they are discontinuous. The integration in (3.1) and the
summation in (3.1*) are over all values of the independent variables
in a given range. The symbol $dx$ in (3.1) denotes the volume element
in this range. By a bar we denote the complex conjugate. Now
let us turn to the general conditions that $f$ and $g$ must satisfy.
First, the sums and integrals in which $f$ and $g$ appear must be
convergent. Second, $f$ and $g$ must satisfy certain boundary
conditions, which, generally speaking, depend on the type of $L$.

If L+ coincides with L, then L is called a hermitian operator. 

When the independent variable is discontinuous, we can
write $L$ in the form (2.6), and formula (3.1*) then yields

$$ \sum_{m,n} \left( K_{nm} - \overline{K^+_{mn}} \right) \bar{g}_n f_m = 0 \tag{3.2} $$

For this equation to hold for arbitrary $f$ and $g$, the coefficient
of every product $\bar{g}_n f_m$ must vanish. Taking the complex
conjugate of this condition, we find the matrix elements of the
hermitian conjugate operator:

$$ K^+_{mn} = \overline{K_{nm}} \tag{3.3} $$

An operator is hermitian if and only if its matrix elements
satisfy the condition

$$ K_{mn} = \overline{K_{nm}} \tag{3.3*} $$


Such a matrix is called hermitian (self-adjoint, self-conjugate).
If we arrange its elements in a rectangular array in such a way
that, say, $K_{ij}$ stands in the $i$th row and in the $j$th column,
the elements on the principal diagonal (that is, with equal
subscripts) are real, and any two matrix elements that are in
positions symmetric with respect to the principal diagonal are
complex conjugates of each other.
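
The rules (3.3) and (3.3*) are easy to try out on small matrices. The sketch below is only an illustration: the matrices `A` and `B`, the sequences `f` and `g`, and all helper names are hypothetical. It builds the hermitian conjugate from (3.3), tests condition (3.3*), and checks the defining relation (3.1*) numerically.

```python
def dagger(K):
    """Hermitian conjugate of a square matrix: (K^+)[m][n] = conj(K[n][m]), cf. (3.3)."""
    size = len(K)
    return [[complex(K[n][m]).conjugate() for n in range(size)]
            for m in range(size)]


def is_hermitian(K, tol=1e-12):
    """True if K coincides with its hermitian conjugate, i.e. satisfies (3.3*)."""
    Kd = dagger(K)
    size = len(K)
    return all(abs(K[i][j] - Kd[i][j]) <= tol
               for i in range(size) for j in range(size))


def inner(g, f):
    """Sum over n of conj(g_n) * f_n."""
    return sum(complex(x).conjugate() * y for x, y in zip(g, f))


def apply_operator(K, f):
    """(Kf)_n = sum over m of K[n][m] * f[m]."""
    return [sum(k * x for k, x in zip(row, f)) for row in K]


A = [[2, 1 - 1j], [1 + 1j, 3]]   # real diagonal, conjugate off-diagonal pair
B = [[1, 2j], [3, 4]]            # arbitrary non-hermitian matrix

f = [1 + 2j, -1j]
g = [0.5, 3 - 1j]

# Defining relation (3.1*): sum conj(g) (B f) equals sum conj(B^+ g) f.
lhs = inner(g, apply_operator(B, f))
rhs = inner(apply_operator(dagger(B), g), f)

print(is_hermitian(A), is_hermitian(B))  # True False
print(abs(lhs - rhs) < 1e-12)            # True
```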

Let us now consider the case of one continuous variable and
assume that an operator $L$ has a kernel $K(x, \xi)$. We denote the
kernel of the hermitian conjugate operator by $K^+(x, \xi)$ and use
formula (2.3) to obtain condition (3.1) in the form

$$ \iint \left[ K(x, \xi) - \overline{K^+(\xi, x)} \right] \bar{g}(x)\, f(\xi)\, dx\, d\xi = 0 \tag{3.4} $$

which holds true for arbitrary functions $f$ and $g$ if and only if
the expression in brackets vanishes. This yields the following
expression for the kernel of the hermitian conjugate in terms of the
kernel of the original operator:

$$ K^+(x, \xi) = \overline{K(\xi, x)} \tag{3.5} $$

Thus, knowing the kernel of an operator, we can find the kernel
of the hermitian conjugate by interchanging the independent
variables of the original kernel and then forming the complex
conjugate. The condition for hermiticity is

$$ K(x, \xi) = \overline{K(\xi, x)} \tag{3.5*} $$

In a similar manner we can find the kernel of a hermitian
conjugate when there are several independent variables. Let us
illustrate this with the help of the operator $G$ of the previous
section. The kernel of $G$, where $G$ is the inverse of the Laplacian
operator $\nabla^2$, is invariant under the interchange of $x, y, z$
and $\xi, \eta, \zeta$. Since the kernel is also a real function, $G$
is hermitian.

If the inverse operator is hermitian, the original is hermitian
too. Notably, the Laplacian operator is hermitian. This can be
proved directly using the general formula (3.1). In fact, using
vector notation, we find that

$$ \bar{g}\, \nabla^2 f - \overline{\nabla^2 g}\, f = \operatorname{div} \left[ \bar{g}\, \operatorname{grad} f - (\operatorname{grad} \bar{g})\, f \right] \tag{3.6} $$

If, in addition, $f$ and $g$ vanish at infinity, then

$$ \int \left[ \bar{g}\, \nabla^2 f - \overline{\nabla^2 g}\, f \right] dx = \int \operatorname{div} \left[ \bar{g}\, \operatorname{grad} f - (\operatorname{grad} \bar{g})\, f \right] dx = 0 \tag{3.7} $$

according to Gauss’s integral theorem.

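Let us consider another example. We set

$$ L = \frac{\partial}{\partial x} \tag{3.8} $$

and go on to find the hermitian conjugate of $L$. According to (3.1),

$$ \int_a^b \left[ \bar{g}\, \frac{\partial f}{\partial x} - \overline{\left( -\frac{\partial g}{\partial x} \right)}\, f \right] dx = \int_a^b \frac{\partial}{\partial x} (\bar{g} f)\, dx = 0 \tag{3.9} $$

if $f$ and $g$ vanish at the limits of integration. This yields

$$ L^+ = -\frac{\partial}{\partial x} \tag{3.10} $$

We see that $L$ is not hermitian. But if we multiply it by a pure
imaginary number, say $-i$, the new operator $L_1$ yields

$$ L_1 = -i \frac{\partial}{\partial x} = L_1^+ \tag{3.11} $$

which means that $L_1$ is hermitian.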
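
A discrete analogue makes (3.10) and (3.11) tangible. The sketch below assumes a central-difference approximation to differentiation on a grid whose functions vanish outside it; this mimics the boundary condition used in (3.9) but is not the construction of the text. The grid size, the step `h` and the helper names are hypothetical.

```python
def central_difference(n, h):
    """n x n matrix approximating d/dx on grid points, with f = 0 outside the grid."""
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i + 1 < n:
            D[i][i + 1] = 1.0 / (2 * h)
        if i - 1 >= 0:
            D[i][i - 1] = -1.0 / (2 * h)
    return D


def dagger(K):
    """Hermitian conjugate: transpose and complex-conjugate."""
    size = len(K)
    return [[complex(K[j][i]).conjugate() for j in range(size)]
            for i in range(size)]


def matrices_equal(A, B, tol=1e-12):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


D = central_difference(5, 0.1)
minus_D = [[-x for x in row] for row in D]
L1 = [[-1j * x for x in row] for row in D]   # discrete analogue of -i d/dx

print(matrices_equal(dagger(D), minus_D))  # True: D^+ = -D, as in (3.10)
print(matrices_equal(dagger(L1), L1))      # True: -i D is hermitian, as in (3.11)
print(matrices_equal(dagger(D), D))        # False: D itself is not hermitian
```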

4. Operator and matrix multiplication

The product of two operators $K$ and $L$ is the operator that
consists in applying $K$ and $L$ successively. If we apply $L$ first
and $K$ second, the product is written as

$$ M = KL $$

But if we apply $K$ first and $L$ second, the product will be

$$ N = LK $$

Generally speaking, $M$ and $N$ are different operators, which implies
that the product of operators depends on the order of
multiplication. For instance, if $K$ denotes multiplication by $x$
and $L$ differentiation with respect to $x$,

$$ Kf = xf, \qquad Lf = \frac{\partial f}{\partial x} \tag{4.1} $$

we find that the product $M$ acts on $f$ as

$$ Mf = KLf = x \frac{\partial f}{\partial x} \tag{4.2} $$

whereas

$$ Nf = LKf = \frac{\partial}{\partial x}(xf) = x \frac{\partial f}{\partial x} + f \tag{4.3} $$

We see that in our example $KL \neq LK$ and that

$$ (LK - KL) f = \frac{\partial}{\partial x}(xf) - x \frac{\partial f}{\partial x} = f $$

Hence the difference $LK - KL$ is the identity operator:

$$ LK - KL = 1 \tag{4.4} $$

In some cases, however, the product does not depend on the 
order of multiplication. Then operator multiplication becomes 
commutative and the operators are said to commute. For example, 
differentiation with respect to two different independent variables 
of a function is commutative, and the operators corresponding to 
the differentiations will commute. 
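
The relation (4.4) can be verified exactly on polynomials. In the sketch below a polynomial is stored as a list of coefficients, with `c[k]` multiplying $x^k$; this representation and the helper names `mul_x`, `ddx` and `sub` are assumptions of the example, not notation of the text.

```python
def mul_x(c):
    """K: f -> x f. Multiplying by x shifts every coefficient up by one power."""
    return [0] + list(c)


def ddx(c):
    """L: f -> df/dx for a polynomial given by its coefficient list."""
    return [k * c[k] for k in range(1, len(c))] or [0]


def sub(a, b):
    """Coefficient-wise difference a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]


f = [5, -2, 0, 7]           # the polynomial 5 - 2x + 7x^3

LKf = ddx(mul_x(f))         # d/dx (x f)
KLf = mul_x(ddx(f))         # x df/dx
commutator = sub(LKf, KLf)  # (LK - KL) f

print(commutator == f)      # True, in agreement with (4.4)
```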

Let us find the hermitian conjugate of

$$ M = KL \tag{4.5} $$

We have

$$ \int \bar{g}\, M f\, dx = \int \bar{g}\, K(Lf)\, dx $$

Let us denote

$$ Lf = f' $$

(the prime here is merely a label, not a derivative). The previous
expression will then be

$$ \int \bar{g}\, K f'\, dx = \int \overline{(K^+ g)}\, f'\, dx $$

according to the definition of $K^+$. We then set

$$ K^+ g = g' $$

which yields

$$ \int \overline{(K^+ g)}\, f'\, dx = \int \bar{g}'\, L f\, dx = \int \overline{(L^+ g')}\, f\, dx $$

according to the definition of $L^+$. Substituting $K^+ g$ for $g'$,
we finally obtain

$$ \int \bar{g}\, KL f\, dx = \int \overline{(L^+ K^+ g)}\, f\, dx \tag{4.6} $$

If we compare (4.6) with the definition of a hermitian conjugate
operator

$$ \int \bar{g}\, M f\, dx = \int \overline{(M^+ g)}\, f\, dx $$

we find that

$$ M^+ = L^+ K^+ \tag{4.7} $$

or

$$ (KL)^+ = L^+ K^+ \tag{4.8} $$

Thus the hermitian conjugate of a product of operators is equal
to the product of the hermitian conjugates in reversed order.

If $K$ and $L$ are hermitian, their product is not, generally
speaking, hermitian because

$$ (KL)^+ = LK \neq KL $$

But if $K$ and $L$ are hermitian and commute, their product is
hermitian.

Let us consider the product of two operators with kernels
$K(x, \xi)$ and $L(x, \xi)$. We have

$$ Lf = \int L(x, \xi)\, f(\xi)\, d\xi, \qquad KLf = \int \left[ \int K(x, \xi_1)\, L(\xi_1, \xi)\, d\xi_1 \right] f(\xi)\, d\xi $$

We first integrate with respect to $\xi_1$ and then, introducing the
notation

$$ KL(x, \xi) = \int K(x, \xi_1)\, L(\xi_1, \xi)\, d\xi_1 \tag{4.9} $$

we find that the previous formula can be written as

$$ KLf = \int KL(x, \xi)\, f(\xi)\, d\xi \tag{4.10} $$

Thus the product $KL$ has a kernel defined by (4.9).

We now go on to the case when the two operators in a product
act on a function of a discontinuous variable. As we know, the
operators can then be represented by matrices. We have

$$ (Lf)_n = \sum_l L_{nl} f_l, \qquad (Kg)_n = \sum_l K_{nl} g_l \tag{4.11} $$

If we assume that

$$ g = Lf \tag{4.12} $$

then

$$ (KLf)_n = \sum_l \sum_i K_{nl} L_{li} f_i $$

or

$$ (KLf)_n = \sum_i (KL)_{ni} f_i \tag{4.13} $$

where

$$ (KL)_{ni} = \sum_l K_{nl} L_{li} \tag{4.14} $$

Equation (4.14) defines matrix multiplication. Thus $(KL)_{ni}$ is
the inner product of the $n$th row of $K$ and the $i$th column of $L$.
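
The product rule (4.14) and the conjugation rule (4.8) can be tried on 2x2 matrices. In this sketch the two hermitian matrices `sx` and `sy` (the familiar Pauli matrices, chosen here only as a convenient example) do not commute, so their product fails to be hermitian, while `sx` and `K2` commute and their product is hermitian. All names are hypothetical.

```python
def matmul(K, L):
    """Matrix product (4.14): (KL)[n][i] = sum over l of K[n][l] * L[l][i]."""
    return [[sum(K[n][l] * L[l][i] for l in range(len(L)))
             for i in range(len(L[0]))] for n in range(len(K))]


def dagger(K):
    """Hermitian conjugate of a square matrix: transpose and complex-conjugate."""
    size = len(K)
    return [[complex(K[j][i]).conjugate() for j in range(size)]
            for i in range(size)]


def equal(A, B, tol=1e-12):
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


sx = [[0, 1], [1, 0]]      # hermitian
sy = [[0, -1j], [1j, 0]]   # hermitian, does not commute with sx
K2 = [[2, 1], [1, 2]]      # hermitian, commutes with sx

product = matmul(sx, sy)

# Rule (4.8): (KL)^+ = L^+ K^+
rule_holds = equal(dagger(product), matmul(dagger(sy), dagger(sx)))
# A product of non-commuting hermitian matrices is not hermitian ...
product_hermitian = equal(dagger(product), product)
# ... whereas a product of commuting hermitian matrices is.
commute = equal(matmul(sx, K2), matmul(K2, sx))
commuting_product_hermitian = equal(dagger(matmul(sx, K2)), matmul(sx, K2))

print(rule_holds, product_hermitian)         # True False
print(commute, commuting_product_hermitian)  # True True
```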

Let us consider a matrix $U$ whose elements $U_{ij}$ satisfy the
conditions

$$ \sum_l U^+_{il} U_{lk} = \delta_{ik}, \qquad \sum_l U_{il} U^+_{lk} = \delta_{ik} \tag{4.15} $$

where, according to the notation in (3.3),

$$ U^+_{ij} = \overline{U_{ji}} \tag{4.16} $$

and the Kronecker delta, $\delta_{ik}$, is 1 when $i = k$ and zero
otherwise.

If we use matrix multiplication, we can write (4.15) as

$$ U^+ U = 1, \qquad U U^+ = 1 \tag{4.17} $$

where the “1” stands for the identity matrix. A matrix that
satisfies Eqs. (4.17) is called unitary, and the corresponding
operator is said to be unitary. A unitary operator has the following
property. If

$$ g = Uf \tag{4.18} $$

or in matrix form

$$ g_n = \sum_l U_{nl} f_l \tag{4.18*} $$

then

$$ \sum_n \bar{g}_n g_n = \sum_n \sum_{l,k} \overline{U_{nl}}\, U_{nk}\, \bar{f}_l f_k = \sum_{l,k} \Big( \sum_n U^+_{ln} U_{nk} \Big) \bar{f}_l f_k $$

But according to (4.15), we have

$$ \sum_n \bar{g}_n g_n = \sum_l \bar{f}_l f_l \tag{4.19} $$

which means that a sum of type (4.19) is invariant under a
unitary transformation.
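
The invariance (4.19) can be checked on a small example. The 2x2 matrix `U` and the sequence `f` below are hypothetical choices; the sketch first confirms that `U` satisfies the first condition of (4.17) and then compares the two sums in (4.19).

```python
import math


def matmul(K, L):
    return [[sum(K[n][l] * L[l][i] for l in range(len(L)))
             for i in range(len(L[0]))] for n in range(len(K))]


def dagger(K):
    size = len(K)
    return [[complex(K[j][i]).conjugate() for j in range(size)]
            for i in range(size)]


def apply_operator(K, f):
    return [sum(k * x for k, x in zip(row, f)) for row in K]


s = 1 / math.sqrt(2)
U = [[s, 1j * s],
     [1j * s, s]]

UdU = matmul(dagger(U), U)            # should be the identity matrix

f = [3, 4j]
g = apply_operator(U, f)              # g = U f, as in (4.18)

norm_f = sum(abs(x) ** 2 for x in f)  # sum of conj(f_l) f_l
norm_g = sum(abs(x) ** 2 for x in g)  # sum of conj(g_n) g_n

print(abs(norm_f - norm_g) < 1e-12)   # True, in agreement with (4.19)
```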


5. Eigenvalues and eigenfunctions 

The main problem in the theory of operators is the analysis of
the equation

$$ Lf = \lambda f \tag{5.1} $$

where $\lambda$ is a constant quantity. The operator $L$ is assumed
to be a normal operator, that is, an operator that satisfies the
condition

$$ LL^+ = L^+ L \tag{5.2} $$

Hermitian and unitary operators are normal.

A linear equation that is satisfied by assuming the unknown
function to be zero is called a homogeneous linear equation, which
is the case with Eq. (5.1). When analyzing such an equation, we
must consider the boundary conditions as well. These conditions
are also homogeneous, that is, such that they are satisfied by
$f = 0$. A homogeneous linear equation along with linear
homogeneous boundary conditions constitutes the homogeneous
problem.

For arbitrary values of the parameter $\lambda$ the homogeneous
problem has, generally speaking, only the trivial solution $f = 0$.
Only when the parameter takes on very special values is a
nontrivial solution possible. These values can form a denumerable
(countable) set $\lambda_0, \lambda_1, \lambda_2, \ldots$ or can
assume all values in a certain interval. The special values of
$\lambda$ are called eigenvalues, and the associated solutions of the
homogeneous problem are called eigenfunctions. In the mathematical
literature one also comes across the terms characteristic values
(functions) or proper values (functions), but these are not used in
physics. The complete set of eigenvalues constitutes the
spectrum of the operator. A denumerable set is said to form a
discrete spectrum, and when the eigenvalues assume all values
in a certain interval, the spectrum is called continuous. In more
complicated situations the spectrum may exhibit both discrete and
continuous parts.

Eigenfunctions corresponding to a discrete spectrum have the
property that

$$ \sum \bar{f} f \qquad \text{and} \qquad \int \bar{f} f\, dx $$

over all values of the independent variables converge, whereas for
the eigenfunctions of a continuous spectrum both the sum and the
integral become infinite. In the latter case, instead of considering
the eigenfunctions themselves, we can integrate them with respect
to $\lambda$ over an infinitesimal interval of the continuous
spectrum. If we then substitute these integrals for the functions in
the above products, the sum and integral are finite.

For the eigenvalues of a hermitian operator we can prove the 
following theorem: 

The eigenvalues of a hermitian operator are real. 


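Let $f$ be a solution of (5.1). We multiply both sides of Eq. (5.1)
by $\bar{f}$ and sum (or integrate) over all values of the independent
variables and get

$$ \sum \bar{f}\, L f = \lambda \sum \bar{f} f $$

or

$$ \int \bar{f}\, L f\, dx = \lambda \int \bar{f} f\, dx $$

which yields

$$ \lambda = \frac{\sum \bar{f}\, L f}{\sum \bar{f} f} \tag{5.3} $$

or

$$ \lambda = \frac{\int \bar{f}\, L f\, dx}{\int \bar{f} f\, dx} \tag{5.3*} $$

The denominators in the last two expressions are real positive
numbers. We will next show that the numerators are also real.
In fact, the imaginary parts of the numerators are

$$ \frac{1}{2i} \sum \left[ \bar{f}\, L f - \overline{(Lf)}\, f \right] \qquad \text{or} \qquad \frac{1}{2i} \int \left[ \bar{f}\, L f - \overline{(Lf)}\, f \right] dx $$

These two vanish by (3.1), since $L^+ = L$ for a hermitian operator.
Thus the numerators in (5.3) and (5.3*) are real, which means that
$\lambda$ is real as well. The proof is complete.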
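
The theorem can be illustrated on 2x2 matrices, whose eigenvalues follow from the quadratic characteristic equation. The matrices `A` (hermitian) and `B` (not hermitian), the trial sequence `f` and the helper names are hypothetical. The second helper evaluates the quotient (5.3) for an arbitrary `f`, which is already real for a hermitian matrix because its numerator is real.

```python
import cmath


def eigenvalues_2x2(M):
    """Roots of the characteristic equation lambda^2 - tr * lambda + det = 0."""
    (a, b), (c, d) = M
    tr = a + d
    det = a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr - root) / 2, (tr + root) / 2


def quotient(M, f):
    """The ratio (5.3): sum conj(f) (M f) divided by sum conj(f) f."""
    Mf = [sum(m * x for m, x in zip(row, f)) for row in M]
    num = sum(complex(x).conjugate() * y for x, y in zip(f, Mf))
    den = sum(abs(x) ** 2 for x in f)
    return num / den


A = [[2, 1 - 1j], [1 + 1j, 3]]   # hermitian
B = [[0, 1], [-1, 0]]            # real but not hermitian

eig_A = eigenvalues_2x2(A)       # both values have zero imaginary part
eig_B = eigenvalues_2x2(B)       # purely imaginary pair

f = [1 + 2j, 3 - 1j]
q = quotient(A, f)               # real up to rounding

print([abs(v.imag) < 1e-12 for v in eig_A])  # [True, True]
print([abs(v.imag) < 1e-12 for v in eig_B])  # [False, False]
```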

In some cases formulas (5.3) and (5.3*) indicate the sign of
the eigenvalues. For the Laplacian operator, (5.3*) yields

$$ \lambda = \frac{\int \bar{f}\, \nabla^2 f\, dx}{\int \bar{f} f\, dx} = -\frac{\int (\operatorname{grad} \bar{f} \cdot \operatorname{grad} f)\, dx}{\int \bar{f} f\, dx} $$

Both integrals on the right-hand side are positive. Hence, the
minus sign before the fraction indicates that the eigenvalues of the
Laplacian operator are negative.
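
The passage from the first fraction to the second rests on an integration by parts. A brief sketch of that step, assuming (as in (3.7)) that $f$ vanishes at infinity so that the surface term supplied by Gauss’s theorem drops out:

```latex
\bar{f}\,\nabla^2 f
  = \operatorname{div}\!\left( \bar{f}\,\operatorname{grad} f \right)
    - \operatorname{grad}\bar{f} \cdot \operatorname{grad} f ,
\qquad
\int \bar{f}\,\nabla^2 f \, dx
  = \underbrace{\int \operatorname{div}\!\left( \bar{f}\,\operatorname{grad} f \right) dx}_{=\,0}
    - \int \left| \operatorname{grad} f \right|^2 dx
  \;\le\; 0 .
```

The right-hand side can vanish only if $\operatorname{grad} f = 0$ everywhere, that is, for a constant $f$, which the boundary condition reduces to $f = 0$.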

We shall meet many problems of finding eigenvalues and
eigenfunctions throughout this book, so there is no need to examine
them further at this point.

6. The Stieltjes integral and the operator
corresponding to multiplication
by the independent variable

Let us first recall the concept of the definite (Riemann) integral.
Suppose that the closed interval defined by the limits of
integration $\lambda_0$ and $\lambda$ is divided into $n$ subintervals
by the numbers $\lambda_1, \lambda_2, \ldots, \lambda_{n-1}$. (Here
$\lambda_n = \lambda$.) If we then increase the number of
these subintervals indefinitely ($n \to \infty$) so that even the
largest tends to zero, the Riemann integral of $f(\lambda)$ over the
closed interval $[\lambda_0, \lambda]$ is defined as

$$ \int_{\lambda_0}^{\lambda} f(\lambda)\, d\lambda = \lim_{n \to \infty} \sum_{i=1}^{n} f(\lambda_i)\, \Delta\lambda_i \tag{6.1} $$

where $\Delta\lambda_i = \lambda_i - \lambda_{i-1}$.

We now go on to the Stieltjes integral. Let $\rho(\lambda)$ be either
a monotonically increasing function or the difference between two
monotonic functions. We construct the sum

$$ \sum_{i=1}^{n} f(\lambda_i)\, \Delta\rho(\lambda_i) $$

where $\Delta\rho(\lambda_i) = \rho(\lambda_i) - \rho(\lambda_{i-1})$.
When the subdivision of $[\lambda_0, \lambda]$ is refined
indefinitely, the Stieltjes integral is the limit of this sum:

$$ \int_{\lambda_0}^{\lambda} f(\lambda)\, d\rho(\lambda) = \lim_{n \to \infty} \sum_{i=1}^{n} f(\lambda_i)\, \Delta\rho(\lambda_i) \tag{6.2} $$

Whenever $\rho(\lambda)$ is continuous and has a bounded first
derivative, we can set

$$ \Delta\rho(\lambda_i) = \rho'(\lambda_i)\, \Delta\lambda_i $$

which is legitimate since terms in $(\Delta\lambda_i)^2$ and higher
powers vanish when passing to the limit. Then

$$ \int_{\lambda_0}^{\lambda} f(\lambda)\, d\rho(\lambda) = \int_{\lambda_0}^{\lambda} f(\lambda)\, \rho'(\lambda)\, d\lambda \tag{6.3} $$

and (6.2) includes the Riemann integral (6.1) as a special case.
But the Stieltjes integral has a meaning even when $\rho(\lambda)$ is
a discontinuous function, that is, possesses finite discontinuities.
This is the case with the operator corresponding to multiplication
by the independent variable if the variable is continuous.

Let us take a close look at such an operator. The corresponding
eigenvalue equation is

$$ x f(x, \lambda) = \lambda f(x, \lambda) \tag{6.4} $$

Now this equation cannot be satisfied by any “normal” type of
function, since we would have to assume that $f(x, \lambda)$ is zero
for all values of $x$ except $x = \lambda$. Because of this let us
formally integrate (6.4) with respect to $\lambda$:

$$ x \int_{\lambda_0}^{\lambda} f(x, \lambda)\, d\lambda = \int_{\lambda_0}^{\lambda} \lambda f(x, \lambda)\, d\lambda \tag{6.5} $$

We denote

$$ \int_{\lambda_0}^{\lambda} f(x, \lambda)\, d\lambda = F(x, \lambda) \tag{6.6} $$

and write (6.5) as

$$ x \int_{\lambda_0}^{\lambda} d_\lambda F(x, \lambda) = \int_{\lambda_0}^{\lambda} \lambda\, d_\lambda F(x, \lambda) \tag{6.7} $$

where the integral on the right is the Stieltjes integral. Equation
(6.7) is solved by the function

$$ F(x, \lambda) = 1, \qquad \lambda > x $$

$$ F(x, \lambda) = 0, \qquad \lambda < x \tag{6.8} $$

Indeed, all increments

$$ \Delta F(x, \lambda_i) = F(x, \lambda_{i+1}) - F(x, \lambda_i) $$

except one are zero. The increment that is not zero is equal to
unity and corresponds to the values of $\lambda_i$ and
$\lambda_{i+1}$ that satisfy the inequality

$$ \lambda_i < x < \lambda_{i+1} $$

When $[\lambda_0, \lambda]$ is subdivided indefinitely, this
particular $\lambda_i$ tends to $x$, and in the limit Eq. (6.7) holds
true.
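
The limiting process can be imitated numerically. The sketch below evaluates the sum in (6.2) on a uniform subdivision, first for a smooth weight, where (6.3) applies, and then for the step function (6.8), where the sum picks out the value of the integrand at the jump. The function name `stieltjes_sum`, the uniform grid and the sample values are assumptions of this example.

```python
def stieltjes_sum(f, rho, lam0, lam, n):
    """Sum of f(l_i) * [rho(l_i) - rho(l_{i-1})] over n equal subintervals, cf. (6.2)."""
    total = 0.0
    prev = lam0
    for i in range(1, n + 1):
        cur = lam0 + (lam - lam0) * i / n
        total += f(cur) * (rho(cur) - rho(prev))
        prev = cur
    return total


n = 10000

# Smooth weight rho = lambda^2 on [0, 1]: by (6.3) the integral of
# lambda * 2 * lambda over [0, 1] is 2/3.
smooth = stieltjes_sum(lambda l: l, lambda l: l * l, 0.0, 1.0, n)

# Step weight (6.8) with x = 0.3: only the subinterval containing x contributes,
# so the right-hand side of (6.7) tends to x itself.
x = 0.3
step = stieltjes_sum(lambda l: l, lambda l: 1.0 if l > x else 0.0, 0.0, 1.0, n)

print(abs(smooth - 2 / 3) < 1e-3)  # True
print(abs(step - x) <= 1.5 / n)    # True
```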

Thus our operator has no eigenfunctions in the conventional 
sense, but a function F(x, X) defined by (6.6) does exist. 

If the independent variable $x$ is discontinuous and assumes a
sequence of discrete values

$$ x_1, x_2, \ldots, x_n, \ldots $$

then the eigenvalue equation (6.4) must be written in the form
of a set of equations

$$ x_n f(x_n, \lambda) = \lambda f(x_n, \lambda), \qquad n = 1, 2, \ldots \tag{6.9} $$

Equations (6.9) have a solution in the ordinary sense. Let

$$ f(x_n, \lambda) = 1, \qquad \lambda = x_n $$

$$ f(x_n, \lambda) = 0, \qquad \lambda \neq x_n \tag{6.10} $$

This is obviously a solution to (6.9). The eigenvalues are

$$ \lambda_n = x_n \tag{6.11} $$

and the corresponding eigenfunctions are

$$ f(x_n, \lambda_m) = \delta_{nm} \tag{6.10*} $$

7. Orthogonality of eigenfunctions and normalization 

Let us examine a hermitian operator with a discrete spectrum.
We write down the eigenvalue equations for $f_n$ and $f_m$
corresponding to two different eigenvalues $\lambda_n$ and
$\lambda_m$:

$$ L f_m = \lambda_m f_m, \qquad L f_n = \lambda_n f_n \tag{7.1} $$

For hermitian operators we have

$$ \int \left[ \bar{f}_n\, L f_m - \overline{(L f_n)}\, f_m \right] dx = 0 \tag{7.2} $$

If we use (7.1) in (7.2) and recall that the eigenvalues are real,
we obtain

$$ (\lambda_m - \lambda_n) \int \bar{f}_n f_m\, dx = 0 \tag{7.3} $$

Since by definition $\lambda_m \neq \lambda_n$,

$$ \int \bar{f}_n f_m\, dx = 0, \qquad n \neq m \tag{7.4} $$

This property of eigenfunctions is called orthogonality. Thus

Eigenfunctions corresponding to different eigenvalues of a
hermitian operator are mutually orthogonal.
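
For a matrix operator the integral in (7.4) becomes a sum, and the theorem can be checked directly. The hermitian matrix `A` below is a hypothetical example; its eigenvalues 1 and 4 are the roots of its characteristic equation $\lambda^2 - 5\lambda + 4 = 0$, and the helper `eigenvector` is a sketch valid only for 2x2 matrices with a nonzero upper-right element.

```python
import math

A = [[2, 1 - 1j], [1 + 1j, 3]]   # hermitian, with eigenvalues 1 and 4


def apply_operator(M, f):
    return [sum(m * x for m, x in zip(row, f)) for row in M]


def inner(g, f):
    """Sum over n of conj(g_n) * f_n, the discrete counterpart of the integral in (7.4)."""
    return sum(complex(x).conjugate() * y for x, y in zip(g, f))


def eigenvector(M, lam):
    """Normalized solution of (M - lam) v = 0 for a 2x2 matrix with M[0][1] != 0."""
    a, b = M[0]
    v = [b, lam - a]   # satisfies the first row: (a - lam) v0 + b v1 = 0
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


lam1, lam2 = 1, 4
f1 = eigenvector(A, lam1)
f2 = eigenvector(A, lam2)

# Largest deviation from L f = lambda f for each vector.
residual1 = max(abs(y - lam1 * x) for x, y in zip(f1, apply_operator(A, f1)))
residual2 = max(abs(y - lam2 * x) for x, y in zip(f2, apply_operator(A, f2)))

overlap = inner(f1, f2)   # vanishes, in agreement with (7.4)

print(residual1 < 1e-12, residual2 < 1e-12)  # True True
print(abs(overlap) < 1e-12)                  # True
```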

Since an eigenfunction satisfies a homogeneous equation, if it
is multiplied by a constant factor, the product will also satisfy
this equation. The factor can be chosen so that

$$ \int \bar{f}_n f_n\, dx = 1 \tag{7.5} $$

Making this choice is called normalization, and functions that
satisfy (7.5) are spoken of as normalized. The normalization is
not completely definite because if we substitute
$f'_n = e^{i\alpha_n} f_n$ for $f_n$, with $\alpha_n$ real, $\bar{f}_n$
will be replaced by $\bar{f}'_n = e^{-i\alpha_n} \bar{f}_n$ and
condition (7.5) will again hold.

The orthogonality condition and the normalization of functions
can be written compactly as

$$ \int \bar{f}_n f_m\, dx = \delta_{nm} \tag{7.6} $$

An eigenvalue can relate to one or several eigenfunctions, in
fact, to as many as the equation

$$ Lf = \lambda_n f $$

has linearly independent solutions for the given eigenvalue
$\lambda_n$. When there are several eigenfunctions corresponding to
one eigenvalue, we have degeneracy.

The number of solutions (denoted by $s$) may depend on $n$. If
the solutions are

$$ f_{n1}, f_{n2}, \ldots, f_{ns} \tag{7.7} $$

then any superposition

$$ f_n = a_1 f_{n1} + a_2 f_{n2} + \ldots + a_s f_{ns} \tag{7.8} $$

will also be a solution. The functions (7.7) are not necessarily
orthogonal. But we can always replace them with superpositions
of type (7.8) that are orthogonal and normalized. Let us assume
that this has been done. Then the functions $f_{nl}$ and $f_{nk}$
belonging to the same eigenvalue satisfy the condition

$$ \int \bar{f}_{nl} f_{nk}\, dx = \delta_{lk}, \qquad l, k = 1, 2, \ldots, s \tag{7.9} $$

At times it is convenient to denote all functions (7.7) by one
symbol $f_n$. Then (7.9) can be written as (7.5), and its two sides
are understood to be matrices with elements (7.9).

Condition (7.9) does not fully determine the functions $f_{nl}$. In
fact, if we set

$$ f'_{nk} = \sum_{l=1}^{s} a_{kl} f_{nl} \tag{7.10} $$

with $a_{kl}$ complying with the condition

$$ \sum_{l=1}^{s} \bar{a}_{kl}\, a_{il} = \delta_{ki} \tag{7.11} $$

then (7.9) will again hold. A matrix $A$ with elements $a_{kl}$ that
satisfy (7.11) is said to be unitary. So is the transformation that
is carried out by means of this matrix. We can thus say that the
eigenfunctions corresponding to a degenerate eigenvalue are
determined up to a unitary transformation.
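
One standard way of constructing orthogonal and normalized superpositions of type (7.8) is the Gram-Schmidt procedure; the text does not prescribe a particular method, so the following is only a sketch under that assumption. The two input sequences stand for hypothetical, linearly independent solutions belonging to one degenerate eigenvalue.

```python
import math


def inner(g, f):
    """Sum over n of conj(g_n) * f_n."""
    return sum(complex(x).conjugate() * y for x, y in zip(g, f))


def gram_schmidt(vectors):
    """Replace linearly independent vectors by orthonormal superpositions of them."""
    basis = []
    for v in vectors:
        w = [complex(x) for x in v]
        for u in basis:
            c = inner(u, w)                           # component of w along u
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(inner(w, w).real)
        basis.append([wi / norm for wi in w])
    return basis


# Two non-orthogonal solutions for the same (hypothetical) eigenvalue.
solutions = [[1, 1j, 0], [1, 0, 1]]
orthonormal = gram_schmidt(solutions)

# Matrix of inner products; by (7.9) it should be the 2x2 identity.
gram = [[inner(a, b) for b in orthonormal] for a in orthonormal]

print(all(abs(gram[i][j] - (1 if i == j else 0)) < 1e-12
          for i in range(2) for j in range(2)))  # True
```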

Let us now turn to the case of an operator with a continuous
spectrum. The eigenvalue equation is

$$ L f(x, \lambda) = \lambda f(x, \lambda) \tag{7.12} $$

We integrate both sides with respect to $\lambda$ twice: first over
the interval $[\lambda_1, \lambda_1 + \Delta_1\lambda]$, then over
the interval $[\lambda_2, \lambda_2 + \Delta_2\lambda]$. If we
assume, as was done in considering the Stieltjes integral, that

$$ F(x, \lambda) = \int_{\lambda_0}^{\lambda} f(x, \lambda)\, d\lambda \tag{7.13} $$

we find that

$$ L[\Delta_1 F(x, \lambda)] = \int_{\lambda_1}^{\lambda_1 + \Delta_1\lambda} \lambda\, d_\lambda F(x, \lambda) \tag{7.14} $$

and

$$ L[\Delta_2 F(x, \lambda)] = \int_{\lambda_2}^{\lambda_2 + \Delta_2\lambda} \lambda\, d_\lambda F(x, \lambda) \tag{7.14*} $$

where for the sake of brevity we have set

$$ \Delta_k F(x, \lambda) = F(x, \lambda_k + \Delta_k\lambda) - F(x, \lambda_k), \qquad k = 1, 2 \tag{7.15} $$

A quantity defined by (7.15) is called a proper differential. Proper
differentials are square-integrable with respect to $x$, whereas the
functions $f(x, \lambda)$ are not.

We multiply (7.14) by $\overline{\Delta_2 F}$ and the complex
conjugate of (7.14*) by $\Delta_1 F$, subtract one from the other,
and integrate the result. For a hermitian operator the left-hand
side vanishes, which means that the right-hand side is also zero,
that is,

$$ \int dx \int_{\lambda_1}^{\lambda_1 + \Delta_1\lambda} \int_{\lambda_2}^{\lambda_2 + \Delta_2\lambda} (\lambda - \mu)\, d_\lambda F(x, \lambda)\, \overline{d_\mu F(x, \mu)} = 0 \tag{7.16} $$

The last equation is valid for any $\lambda_1$, $\lambda_2$,
$\Delta_1\lambda$, and $\Delta_2\lambda$. Let us assume that the
intervals $\Delta_1\lambda$ and $\Delta_2\lambda$ are separated by a
finite distance and that both are infinitesimal quantities. This
means that up to infinitesimals the difference $\lambda - \mu$ is
equal to $\lambda_1 - \lambda_2$. Since the latter is nonzero, we can
cancel it out and obtain

$$ \int \overline{\Delta_2 F}\, \Delta_1 F\, dx = 0 \tag{7.17} $$

Thus

Proper differentials corresponding to different intervals are
mutually orthogonal.

We now assume that $\Delta_1\lambda$ and $\Delta_2\lambda$ coincide,
and we consider the integral

$$ J = \int \overline{\Delta F}\, \Delta F\, dx \tag{7.18} $$

We choose arbitrarily two eigenvalues $\lambda_1$ and $\lambda_2$
such that the interval $\Delta\lambda$ lies between them:

$$ \lambda_1 < \lambda < \lambda + \Delta\lambda < \lambda_2 $$

By the orthogonality condition for proper differentials the
integral $J$ will not change if we add to it

$$ \int \overline{\Delta F} \left( \int_{\lambda_1}^{\lambda} f(x, \lambda)\, d\lambda + \int_{\lambda + \Delta\lambda}^{\lambda_2} f(x, \lambda)\, d\lambda \right) dx $$

Hence,

$$ J = \int \overline{\Delta F}\, \left[ F(x, \lambda_2) - F(x, \lambda_1) \right] dx \tag{7.19} $$

which implies that $J$ is an infinitesimal of the first order
relative to $\Delta\lambda$, rather than of the second order as one
would expect it to be. The integral (7.18) can be normalized so that

$$ \lim_{\Delta\lambda \to 0} \frac{1}{\Delta\lambda} \int |\Delta F|^2\, dx = 1 \tag{7.20} $$

This is the usual normalization condition for the “eigenfunctions”
of an operator with a continuous spectrum. For instance, the
function $F(x, \lambda)$ of Section 6 is normalized in this way.
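
As a brief check of (7.20), take the step function (6.8) of Section 6. Its proper differential is equal to unity inside the interval and zero outside (the values at the end points do not affect the integral), so that

```latex
\Delta F(x, \lambda) = F(x, \lambda + \Delta\lambda) - F(x, \lambda)
  = \begin{cases} 1, & \lambda < x < \lambda + \Delta\lambda \\ 0, & \text{otherwise} \end{cases}
\qquad\Longrightarrow\qquad
\frac{1}{\Delta\lambda} \int |\Delta F|^2 \, dx
  = \frac{1}{\Delta\lambda} \int_{\lambda}^{\lambda + \Delta\lambda} dx = 1 .
```

The integral is indeed of the first order in $\Delta\lambda$, and two proper differentials belonging to non-overlapping intervals have zero product everywhere, in agreement with (7.17).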

8. Expansion in eigenfunctions.
Completeness property of eigenfunctions

We will consider the eigenfunctions

$$ f_n(x) = f(x, \lambda_n) \tag{8.1} $$

of an operator with a discrete spectrum and assume them to be
normalized. Let $f(x)$ be an arbitrary square-integrable function.
We will try to expand it in a series of functions $f_n(x)$. For this
we set

$$ f(x) = \sum_{k=0}^{n} a_k f_k(x) + R_n(x) \tag{8.2} $$

The sum of the products on the right side constitutes the first
terms of the expansion, and $R_n(x)$ denotes the remainder
after the $n$th term. The expansion coefficients $a_k$ are chosen in
such a way as to ensure the smallest possible error, where for
the measure of error we take the integral

$$ \rho_n = \int |R_n(x)|^2\, dx \tag{8.3} $$

which is

$$ \rho_n = \int \left| f(x) - \sum_{k=0}^{n} a_k f_k(x) \right|^2 dx \tag{8.4} $$


This integral is a quadratic function of the unknowns $a_k$. We
will look for the minimum of (8.4), and for this will equate to
zero the first partial derivatives with respect to the expansion
coefficients. By the orthogonality condition for eigenfunctions we
find that

$$ \rho_n = \int |f(x)|^2\, dx - \sum_{k=0}^{n} a_k \int \bar{f}(x)\, f_k(x)\, dx - \sum_{k=0}^{n} \bar{a}_k \int f(x)\, \bar{f}_k(x)\, dx + \sum_{k=0}^{n} \bar{a}_k a_k \tag{8.5} $$

Obviously we can differentiate with respect to $a_k$ and $\bar{a}_k$
as if these were independent quantities. Equating the derivative with
respect to $\bar{a}_k$ to zero, we obtain the following formula for
$a_k$:

$$ a_k = \int \bar{f}_k(x)\, f(x)\, dx \tag{8.6} $$

The use of (8.6) in the expression (8.5) for the mean square
error $\rho_n$ yields

$$ \rho_n = \int |f(x)|^2\, dx - \sum_{k=0}^{n} |a_k|^2 \tag{8.7} $$

Since by definition $\rho_n$ cannot be negative, for any $n$ there is
the inequality

$$ \sum_{k=0}^{n} |a_k|^2 \le \int |f(x)|^2\, dx \tag{8.8} $$

If for any square-integrable function $f(x)$

$$ \lim_{n \to \infty} \rho_n = 0 \tag{8.9} $$

or, which is the same,

$$ \sum_{k=0}^{\infty} |a_k|^2 = \int |f(x)|^2\, dx \tag{8.10} $$

the set of functions $f_k(x)$ is said to be complete, which means
that no function $f(x)$ can be found that would be orthogonal to
all functions $f_k(x)$. Indeed, for such a function all expansion
coefficients are zero, which implies that $\int |f(x)|^2\, dx$ is also
zero. But this is possible only if $f(x)$ is zero (with the possible
exception of certain values of $x$).

Equation (8.9) shows that in the limit the remainder $R_n(x)$
vanishes (with the possible exception of certain values of $x$), so
we have here a series expansion

$$ f(x) = \sum_{n=0}^{\infty} a_n f_n(x) \tag{8.11} $$

If a second function $g(x)$ can be expanded in a series

$$ g(x) = \sum_{n=0}^{\infty} b_n f_n(x) \tag{8.12} $$

then we have

$$ \int \bar{f}(x)\, g(x)\, dx = \sum_{n=0}^{\infty} \bar{a}_n b_n \tag{8.13} $$

which is a generalization of (8.10).
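
The behaviour of the error (8.7) can be watched numerically. The sketch below assumes a concrete orthonormal set, the functions $\sqrt{2/\pi}\,\sin kx$ on $[0, \pi]$, and a sample function $x(\pi - x)$; neither choice comes from the text. The integrals are approximated by the midpoint rule, so the figures are approximate.

```python
import math

N = 2000                        # number of midpoint nodes (an assumption)
h = math.pi / N
xs = [(j + 0.5) * h for j in range(N)]


def integrate(values):
    """Midpoint-rule approximation of an integral over [0, pi]."""
    return h * sum(values)


def basis(k, x):
    """Orthonormal functions sqrt(2/pi) * sin(kx) on [0, pi]."""
    return math.sqrt(2 / math.pi) * math.sin(k * x)


def f(x):
    return x * (math.pi - x)


norm_sq = integrate([f(x) ** 2 for x in xs])   # integral of |f|^2

rhos = []        # mean square errors (8.7) after 1, 2, ..., 9 terms
partial = 0.0
for k in range(1, 10):
    # Coefficient (8.6); every function here is real, so no conjugation is needed.
    a_k = integrate([basis(k, x) * f(x) for x in xs])
    partial += a_k ** 2
    rhos.append(norm_sq - partial)

print(all(r > 0 for r in rhos))  # True: inequality (8.8) holds at every stage
print(rhos[-1] < 1e-4)           # True: the error tends to zero, as in (8.9)
```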

There is a theorem in mathematics for operators of a very
general form, which states that

The totality of eigenfunctions makes up a complete set of 
functions. 

If there is a degeneracy, in the expansion

$$ f(x) = \sum_{n=0}^{\infty} a_n f_n(x) \tag{8.11} $$

every term

$$ a_n f_n(x) \tag{8.14} $$

must be replaced by a sum

$$ a_{n1} f_{n1}(x) + a_{n2} f_{n2}(x) + \ldots + a_{ns} f_{ns}(x) \tag{8.14*} $$

where

$$ a_{nl} = \int \bar{f}_{nl}(x)\, f(x)\, dx \tag{8.15} $$

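In the case of a continuous spectrum the proper differentials
divided by $(\Delta\lambda)^{1/2}$,

$$ \frac{1}{\sqrt{\Delta\lambda}}\, \Delta F(x, \lambda) \tag{8.16} $$

form a mutually orthogonal and normalized set of functions.
Hence we can write the expansion in the form

$$ f(x) = \sum a(\lambda)\, \frac{1}{\sqrt{\Delta\lambda}}\, \Delta F(x, \lambda) \tag{8.17} $$

where

$$ a(\lambda) = \frac{1}{\sqrt{\Delta\lambda}} \int \overline{\Delta F(x, \lambda)}\, f(x)\, dx \tag{8.18} $$

or, as $\Delta\lambda \to 0$,

$$ f(x) = \int c(\lambda)\, d_\lambda F(x, \lambda) \tag{8.19} $$

where

$$ c(\lambda) = \lim_{\Delta\lambda \to 0} \frac{1}{\Delta\lambda} \int \overline{\Delta F(x, \lambda)}\, f(x)\, dx \tag{8.20} $$

At times these formulas can be replaced by simpler ones:

$$ f(x) = \int c(\lambda)\, f(x, \lambda)\, d\lambda, \qquad c(\lambda) = \int \overline{f(x, \lambda)}\, f(x)\, dx \tag{8.21} $$

These represent an integral expansion similar to the Fourier integral
expansion, which is a particular case of (8.21).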
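
To make the connection with the Fourier integral explicit, here is a sketch under the assumption that the operator is $-i\,\partial/\partial x$ (the hermitian operator of (3.11)) acting on functions defined on the whole real axis; the factor $1/\sqrt{2\pi}$ is the conventional normalization and is not derived in this section.

```latex
f(x, \lambda) = \frac{1}{\sqrt{2\pi}}\, e^{i\lambda x},
\qquad
-i \frac{\partial}{\partial x} f(x, \lambda) = \lambda f(x, \lambda),
\\[6pt]
f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} c(\lambda)\, e^{i\lambda x}\, d\lambda,
\qquad
c(\lambda) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-i\lambda x}\, f(x)\, dx .
```

With this choice the two formulas in (8.21) become the Fourier integral and its inversion.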

The completeness property of eigenfunctions for an operator
with a continuous spectrum is

$$ \int |f(x)|^2\, dx = \int |c(\lambda)|^2\, d\lambda \tag{8.22} $$

or, if $b(\lambda)$ is an expansion coefficient of another function
$g(x)$ (just as $c(\lambda)$ is for $f(x)$),

$$ \int \bar{f}(x)\, g(x)\, dx = \int \bar{c}(\lambda)\, b(\lambda)\, d\lambda \tag{8.23} $$

If an operator has a continuous spectrum as well as a discrete 
spectrum, the expansion in eigenfunctions and the completeness 
condition will each include both a sum and an integral. 



Chapter III 


QUANTUM MECHANICAL OPERATORS 


1. Interpretation of the eigenvalues of an operator 

At the beginning of the previous chapter we found that in
quantum mechanics a definite linear operator is related to each
physical quantity. What is the meaning of such a relationship?
We saw that an operator possesses certain eigenvalues and
eigenfunctions. Now we must clarify the physical meaning of these
mathematical concepts. Let us start by interpreting the
eigenvalues, since they are the simpler concept. We will introduce
the following hypothesis:

The eigenvalues of an operator related to a given physical
quantity are the values that this quantity assumes in the
conditions created by measuring it.

We must note the importance of specifying the conditions in
which the quantity assumes its values. If we measure a quantity
that is not included in the same group with the original quantity
(see Chapter I), new conditions are created. In these the original
quantity may not have definite values. However, when measuring
a quantity, we create conditions in which one of the eigenvalues
of the corresponding operator must appear as the result of the
measuring process. To state our interpretation more briefly, we
can say: The eigenvalues of an operator are the values of the
corresponding physical quantity.

Hence there is a limitation on the form of the operator
corresponding to a real physical quantity. Since all the values of
this quantity are real, the operator must have only real eigenvalues,
which points to its hermiticity. Thus

A real physical quantity is described by a hermitian operator.

As we know, an operator can have both discrete and continuous
spectra. Therefore operators can correspond to quantities that
assume either a denumerable set of values or all the
values in a certain interval. We must note here that the old
quantum theory could formulate “quantum conditions” only for
quantities that change abruptly and did not include the cases when
quantities change continuously.

2. Poisson brackets

How do we find the operator for a given physical quantity? We
may use two guiding principles. First, the eigenvalue spectrum of
the operator must coincide with the spectrum of observable values
of the physical quantity. Second, the relations between operators
must correctly reflect relations between the quantities. In relating
operators to physical quantities we must note that the analogy
with classical mechanics plays an important role. But this
classical analogy must be used cautiously since it may be incomplete.

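In classical mechanics a system is described by canonical
variables: the generalized coordinates $q_1, q_2, \ldots, q_n$ and the
generalized momenta $p_1, p_2, \ldots, p_n$. We can define
canonically conjugate variables with the help of the Poisson
bracket. We start with Hamilton’s canonical equations of motion

$$ \frac{dq_k}{dt} = \frac{\partial H}{\partial p_k}, \qquad \frac{dp_k}{dt} = -\frac{\partial H}{\partial q_k}, \qquad k = 1, 2, \ldots, n \tag{2.1} $$

Let $F$ be a certain function of coordinates, momenta, and time:

$$ F = F(q_1, q_2, \ldots, q_n, p_1, p_2, \ldots, p_n, t) \tag{2.2} $$

We construct the total time-derivative of $F$

$$ \frac{dF}{dt} = \frac{\partial F}{\partial t} + \sum_{k=1}^{n} \left( \frac{\partial F}{\partial q_k} \frac{dq_k}{dt} + \frac{\partial F}{\partial p_k} \frac{dp_k}{dt} \right) \tag{2.3} $$

Using Eqs. (2.1) in (2.3), we obtain

$$ \frac{dF}{dt} = \frac{\partial F}{\partial t} + [H, F] \tag{2.4} $$

where we call

$$ [H, F] = \sum_{k=1}^{n} \left( \frac{\partial H}{\partial p_k} \frac{\partial F}{\partial q_k} - \frac{\partial H}{\partial q_k} \frac{\partial F}{\partial p_k} \right) \tag{2.5} $$

the classical Poisson bracket of the functions $H$ and $F$. Likewise,
for any pair of functions $F$ and $G$ the Poisson bracket is

$$ [F, G] = \sum_{k=1}^{n} \left( \frac{\partial F}{\partial p_k} \frac{\partial G}{\partial q_k} - \frac{\partial F}{\partial q_k} \frac{\partial G}{\partial p_k} \right) \tag{2.6} $$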


The main property of the classical Poisson bracket is its
invariance under a contact transformation, that is, a transformation
of the variables $p_k$ and $q_k$ that leaves the form of Hamilton’s
equations unaltered. Furthermore, the Poisson bracket has the
following properties (which are easily derived from its definition):

$$ [F, G] = -[G, F] \tag{2.7} $$

$$ [F, c] = 0 \tag{2.8} $$

with $c$ being a constant not depending on $p_k$ or $q_k$.

Also

$$ [F_1 + F_2, G] = [F_1, G] + [F_2, G] \tag{2.9} $$

$$ [F_1 F_2, G] = F_1 [F_2, G] + [F_1, G] F_2 \tag{2.10} $$

Finally, we have the identity

$$ [F, [G, L]] + [G, [L, F]] + [L, [F, G]] = 0 \tag{2.11} $$

The Poisson brackets for generalized coordinates and momenta
are

$$ [q_k, q_l] = 0, \qquad [p_k, p_l] = 0, \qquad [p_k, q_l] = \delta_{kl} \tag{2.12} $$

In classical physics (2.12) can serve as a definition of
canonically conjugate coordinates and momenta.
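
The definition (2.6) and several of the properties listed above can be checked numerically for a single degree of freedom. In the sketch below the partial derivatives are approximated by central differences; the sample Hamiltonian (a harmonic oscillator with unit mass and frequency), the evaluation point and all function names are hypothetical.

```python
def partial(F, q, p, wrt, h=1e-5):
    """Central-difference approximation to dF/dq (wrt == "q") or dF/dp (otherwise)."""
    if wrt == "q":
        return (F(q + h, p) - F(q - h, p)) / (2 * h)
    return (F(q, p + h) - F(q, p - h)) / (2 * h)


def bracket(F, G):
    """Poisson bracket (2.6) for one degree of freedom, with the sign convention of the text."""
    def result(q, p):
        return (partial(F, q, p, "p") * partial(G, q, p, "q")
                - partial(F, q, p, "q") * partial(G, q, p, "p"))
    return result


def coord(q, p):
    return q


def momentum(q, p):
    return p


def H(q, p):
    return 0.5 * p * p + 0.5 * q * q   # sample Hamiltonian (an assumption)


q0, p0 = 0.7, -1.3                     # arbitrary point in phase space

pq = bracket(momentum, coord)(q0, p0)  # (2.12): [p, q] = 1
qp = bracket(coord, momentum)(q0, p0)  # (2.7):  [q, p] = -[p, q]
Hq = bracket(H, coord)(q0, p0)         # (2.4):  dq/dt = [H, q] = p
Hp = bracket(H, momentum)(q0, p0)      # (2.4):  dp/dt = [H, p] = -q

# Product rule (2.10) with F1 = q^2, F2 = p, G = q p.
F1 = lambda q, p: q * q
F2 = momentum
G = lambda q, p: q * p
lhs = bracket(lambda q, p: F1(q, p) * F2(q, p), G)(q0, p0)
rhs = F1(q0, p0) * bracket(F2, G)(q0, p0) + bracket(F1, G)(q0, p0) * F2(q0, p0)

print(abs(pq - 1) < 1e-6, abs(qp + 1) < 1e-6)    # True True
print(abs(Hq - p0) < 1e-6, abs(Hp + q0) < 1e-6)  # True True
print(abs(lhs - rhs) < 1e-6)                     # True
```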

As we have already noted, all relationships that use the Poisson
bracket (for instance, the formula for the total time-derivative of
a function) do not depend on the choice of the generalized
coordinates and momenta. We can thus expect that because there is
an analogy between classical and quantum mechanics, there
should be something similar to the Poisson bracket in quantum
mechanics.

The form of the quantum Poisson bracket was found by
P. A. M. Dirac on the basis of Bohr’s correspondence principle
with the classical formula (2.6) as the starting point. Our
approach will be different. Essentially it belongs to Dirac and is
based on the assumption that the quantum Poisson bracket of any
two noncommutative operators possesses all the properties
(2.7)-(2.11).

We start from Eq. (2.10), where we substitute $G_1$ and $G_2$ for $F_1$
and $F_2$, and F for G, and use Eq. (2.7). As a result we obtain
another equation:

$$ [F, G_1 G_2] = G_1 [F, G_2] + [F, G_1] G_2 \qquad (2.10^*) $$

We will think of F and G as noncommutative operators. Hence
the order of multiplication in (2.10) is significant. Let us assume
that the order of multiplication is as written in (2.10). This can be
justified in the following manner. If G = H, Eq. (2.10) corresponds,
at least in classical mechanics, to the rule of finding the
time derivative of $F_1 F_2$. When we are dealing with noncommutative
operators and wish to find the time derivative of $F_1 F_2$, we
must keep the order of multiplication as in (2.10), with $F_1$ always
to the left of $F_2$.

Now let us put $G = G_1 G_2$ in (2.10). Using (2.10*), we can
write (2.10) in the form

$$ [F_1 F_2, G_1 G_2] = F_1 G_1 [F_2, G_2] + F_1 [F_2, G_1] G_2 + G_1 [F_1, G_2] F_2 + [F_1, G_1] G_2 F_2 \qquad (2.13) $$

On the other hand, setting $F = F_1 F_2$ in (2.10*) and using (2.10),
we obtain

$$ [F_1 F_2, G_1 G_2] = G_1 F_1 [F_2, G_2] + G_1 [F_1, G_2] F_2 + F_1 [F_2, G_1] G_2 + [F_1, G_1] F_2 G_2 \qquad (2.13^*) $$

Thus for $[F_1 F_2, G_1 G_2]$ we have two different expressions, which
must be equal irrespective of the forms of F and G. Equating
these two results, we have

$$ (F_1 G_1 - G_1 F_1) [F_2, G_2] = [F_1, G_1] (F_2 G_2 - G_2 F_2) \qquad (2.14) $$

This condition holds only in one case: if for any two operators
F and G

$$ [F, G] = c (FG - GF) \qquad (2.15) $$

where c is an operator that commutes with any other operator.
Only multiplication into a constant has such a property. Therefore c
is a constant. We can easily show that c is pure imaginary. Indeed,
we want the Poisson bracket of two real quantities to be real.
Hence, if F and G are hermitian, [F, G] must be hermitian too.
But by Eq. (4.8), Chapter II,

$$ [F, G]^{+} = \bar{c} (G^{+} F^{+} - F^{+} G^{+}) = -\bar{c} (FG - GF) \qquad (2.15^*) $$

If (2.15*) coincides with (2.15), then necessarily $c = -\bar{c}$, which
yields

$$ c = \frac{i}{h'} \qquad (2.16) $$

where h' is real. Thus

$$ [F, G] = \frac{i}{h'} (FG - GF) \qquad (2.17) $$

Let us show that (2.17) satisfies all the properties of the Poisson
bracket (2.7)-(2.11). We immediately see that (2.17) satisfies
(2.7)-(2.9). Furthermore

$$ \frac{i}{h'} (F_1 F_2 G - G F_1 F_2) = \frac{i}{h'} F_1 (F_2 G - G F_2) + \frac{i}{h'} (F_1 G - G F_1) F_2 $$

which means that (2.10) is satisfied as well. Finally, to prove
that (2.11) holds we must in the obvious identity

$$ FGL + GLF + LFG + LGF + FLG + GFL - FGL - GLF - LFG - LGF - FLG - GFL = 0 $$

group the terms properly. We may thus consider it proved that
the quantum Poisson bracket has the form (2.17). What remains
to be found is the constant quantity h'. For the Poisson bracket
to have the proper dimensions, h' must have the dimensions of
action. The numerical value of h' can be found by first leaving it
undefined, then constructing the appropriate operators, and finally
comparing h' with the results of actual experiments, for example,
comparing the eigenvalues of the energy operator with observed
energy levels. We would then find that

$$ h' = \hbar = \frac{h}{2\pi}, \qquad h = 6.624 \times 10^{-27}\ \text{erg s} \qquad (2.18) $$

where h is Planck’s constant. We will from the start think of h'
as equal to $\hbar$.
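As a quick plausibility check of the statement that the form (2.17) satisfies (2.7), (2.10) and (2.11), one can replace the abstract operators by small matrices. The following sketch is an illustration under that assumption only: the 2×2 matrices are arbitrary, h' is set to 1, and the helper names are invented for the example.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def bracket(F, G, h=1.0):
    """Quantum Poisson bracket of Eq. (2.17): (i/h')(FG - GF)."""
    return [[1j / h * x for x in row] for row in sub(matmul(F, G), matmul(G, F))]


def dagger(A):
    """Hermitian conjugate (transpose plus complex conjugation)."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]


def max_abs(A):
    return max(abs(x) for row in A for x in row)


# Arbitrary non-commuting sample matrices (assumptions made for this example).
F1 = [[1, 2], [0, -1]]
F2 = [[0, 1], [3, 2]]
G = [[2, -1], [1, 1]]
L = [[1, 1], [4, 0]]
# Two hermitian matrices, used to check that their bracket is hermitian too.
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

if __name__ == "__main__":
    product_rule = sub(
        bracket(matmul(F1, F2), G),
        add(matmul(F1, bracket(F2, G)), matmul(bracket(F1, G), F2)),
    )
    print("largest deviation from (2.10):", max_abs(product_rule))
```

A matrix check of this kind cannot replace the general proof, but it makes the role of the operator ordering in (2.10) easy to experiment with.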

Knowing the quantum Poisson bracket enables us to use the 
method of classical analogy to find the form of quantum operators. 
But, of course, the extent to which this analogy is justified will be 
determined by comparing theory with experiment. 

3. Position and momentum operators 

The form of the operator for a given physical quantity depends
on the choice of variables for the functions on which the operator
acts. The operator for the independent variable is always the
multiplication into this variable. This follows from the requirement
that the values of any physical quantity coincide with the
eigenvalues of the corresponding operator (see Section 6 of Chapter II).

Let us take, for example, the coordinate x as the independent 
variable for a system with one degree of freedom. If operators act 
on the functions of x, the position operator will reduce to the 
operator corresponding to multiplication into x. If we take another 
quantity for the independent variable, for instance, energy, the 
energy operator will have the “multiplication” form, whereas the 
position operator will have a new, more complicated form. 2 

How does one choose the independent variables for a system
with several degrees of freedom? Can any combination of variables
be used (for instance, energy, one coordinate, and one component
of the momentum vector for a system with three degrees
of freedom)? We can answer these questions by reasoning as
follows. The independent-variable operators are multiplication
operators and thus commute with each other. But this means
that

The independent variables are those whose operators commute. 


2 See Section 6, Chapter I, Part II.

To decide which quantities (operators) commute and which do
not, we use the analogy between the classical and quantum
Poisson brackets.

The electron of classical physics is a mass point with three
degrees of freedom. Let us denote the position of an electron by the
(Cartesian) coordinates

$$ x = x_1, \qquad y = x_2, \qquad z = x_3 \qquad (3.1) $$

and the momentum components by

$$ p_x = p_1, \qquad p_y = p_2, \qquad p_z = p_3 \qquad (3.2) $$

The classical Poisson brackets of these quantities are

$$ [x_k, x_l] = 0, \qquad [p_k, p_l] = 0, \qquad [p_k, x_l] = \delta_{kl} \qquad (3.3) $$

Let us try to formulate this in quantum terms. We will assume
that the quantum Poisson brackets have the same form as (3.3)
and that the 0 and 1 in the right-hand sides of (3.3) are the
operators of multiplication into zero and unity respectively. If, in
addition, we assume that the coordinates of the electron can take
on any real values from $-\infty$ to $+\infty$, Eqs. (3.3) enable us to find
the form of the position and momentum operators.

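First of all, Eqs. (3.3) show that the operators for $x_1, x_2, x_3$
commute. Hence we must take $x_1, x_2, x_3$ for the independent
variables. This means that the operators will act on functions of the
type

$$ \psi(x, y, z) = \psi(x_1, x_2, x_3) \qquad (3.4) $$

Second, Eqs. (3.3) yield

$$ \frac{i}{\hbar} (p_x x - x p_x) \psi = \psi, \qquad \frac{i}{\hbar} (p_y y - y p_y) \psi = \psi, \qquad \frac{i}{\hbar} (p_z z - z p_z) \psi = \psi \qquad (3.5) $$

The momentum operators

$$ p_x = -i\hbar \frac{\partial}{\partial x}, \qquad p_y = -i\hbar \frac{\partial}{\partial y}, \qquad p_z = -i\hbar \frac{\partial}{\partial z} \qquad (3.6) $$

are, as we already know, hermitian. They also satisfy Eq. (3.5),
since after cancelling out $\hbar$ we have

$$ [p_x, x] \psi = \frac{\partial}{\partial x} (x \psi) - x \frac{\partial \psi}{\partial x} = \psi \qquad (3.7) $$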
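Relation (3.7) can be verified exactly on polynomial test functions. In the sketch below, offered only as an illustration, a polynomial is stored as its list of coefficients, $p_x$ acts as $-i\hbar\,d/dx$, and the value `HBAR = 1.0` as well as the function names are assumptions made for the example.

```python
HBAR = 1.0  # assumed units; the identity does not depend on the value


def derivative(coeffs):
    """d/dx of a polynomial given as [a0, a1, a2, ...]."""
    return [n * coeffs[n] for n in range(1, len(coeffs))] or [0]


def times_x(coeffs):
    """Multiplication into x: shifts every power up by one."""
    return [0] + list(coeffs)


def momentum(coeffs):
    """The operator p_x = -i*hbar*d/dx of Eq. (3.6)."""
    return [-1j * HBAR * c for c in derivative(coeffs)]


def subtract(a, b):
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]


def bracket_p_x(coeffs):
    """(i/hbar)(p_x x - x p_x) applied to a polynomial, cf. Eq. (3.5)."""
    difference = subtract(momentum(times_x(coeffs)), times_x(momentum(coeffs)))
    return [1j / HBAR * c for c in difference]


if __name__ == "__main__":
    psi = [2, -1, 0, 3]  # 2 - x + 3x^3, an arbitrary test function
    print(bracket_p_x(psi))
```

The output reproduces the coefficients of the input polynomial, which is the content of (3.7) for this class of functions.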

We can obtain equations for the y and z components in the same
way.

To find the general form of the momentum operators $p_x, p_y, p_z$
we put

$$ p'_x = -i\hbar \frac{\partial}{\partial x} + q_x, \qquad p'_y = -i\hbar \frac{\partial}{\partial y} + q_y, \qquad p'_z = -i\hbar \frac{\partial}{\partial z} + q_z \qquad (3.8) $$

Equations (3.3) yield

$$ q_k x_l - x_l q_k = 0, \qquad k, l = 1, 2, 3 \qquad (3.9) $$

which means that $q_x, q_y, q_z$ and x, y, z commute. Furthermore, the
q’s must be hermitian. Hence they correspond to multiplication
into real functions of x, y, z. For example,

$$ p'_x \psi = -i\hbar \frac{\partial \psi}{\partial x} + q_x(x, y, z)\, \psi \qquad (3.10) $$


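But $p'_x, p'_y, p'_z$ must also satisfy the conditions

$$ [p'_k, p'_l] = \frac{i}{\hbar} (p'_k p'_l - p'_l p'_k) = 0 $$

or, with $p'_k = -i\hbar\, \partial/\partial x_k + q_k$ from (3.8),

$$ \frac{i}{\hbar} \left[ \left( -i\hbar \frac{\partial}{\partial x_k} + q_k \right) \left( -i\hbar \frac{\partial}{\partial x_l} + q_l \right) - \left( -i\hbar \frac{\partial}{\partial x_l} + q_l \right) \left( -i\hbar \frac{\partial}{\partial x_k} + q_k \right) \right] \psi = 0 $$

that is

$$ \left( \frac{\partial q_l}{\partial x_k} - \frac{\partial q_k}{\partial x_l} \right) \psi = 0 $$

which yields

$$ \frac{\partial q_l}{\partial x_k} - \frac{\partial q_k}{\partial x_l} = 0 \qquad (3.11) $$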


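Hence $q_x, q_y, q_z$ are partial derivatives of one real function f of
the coordinates, with the result that

$$ p'_x \psi = -i\hbar \frac{\partial \psi}{\partial x} + \frac{\partial f}{\partial x} \psi, \qquad p'_y \psi = -i\hbar \frac{\partial \psi}{\partial y} + \frac{\partial f}{\partial y} \psi, \qquad p'_z \psi = -i\hbar \frac{\partial \psi}{\partial z} + \frac{\partial f}{\partial z} \psi \qquad (3.12) $$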


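We now show that by transforming the function $\psi$ we can reduce
the operators in (3.12) to the simpler form (3.6). Let us
assume that

$$ \psi' = e^{if/\hbar} \psi \qquad (3.13) $$

$$ p^*_x \psi' = e^{if/\hbar} p'_x \psi \qquad (3.14) $$

We seek the form of $p^*_x$. We have

$$ -i\hbar \frac{\partial \psi'}{\partial x} = e^{if/\hbar} \left( -i\hbar \frac{\partial \psi}{\partial x} + \frac{\partial f}{\partial x} \psi \right) = e^{if/\hbar} p'_x \psi \qquad (3.15) $$

If we compare (3.14) and (3.15), we obtain

$$ p^*_x \psi' = -i\hbar \frac{\partial \psi'}{\partial x} $$

and similar equations for y and z. We see that

$$ p^*_x = p_x, \qquad p^*_y = p_y, \qquad p^*_z = p_z \qquad (3.16) $$

Thus the new operators are of form (3.6), and the relationship
between $p_x, p_y, p_z$ and $p'_x, p'_y, p'_z$ is as follows:

$$ p_x = e^{if/\hbar} p'_x e^{-if/\hbar}, \qquad p_y = e^{if/\hbar} p'_y e^{-if/\hbar}, \qquad p_z = e^{if/\hbar} p'_z e^{-if/\hbar} \qquad (3.17) $$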

We will assume that this transformation was done at the very 
beginning. So we can view momentum operators p x , p y , p z acting 
on functions of coordinates as the operators (3.6). 
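The reduction of the general momentum operator to the simple form $-i\hbar\,\partial/\partial x$ by a phase transformation of the wave function can be checked numerically in one dimension. The sketch below is an illustration only: the phase function `f`, the test function `psi_new`, the value `HBAR = 1.0` and the finite-difference derivative are all assumptions made for the example.

```python
import cmath

HBAR = 1.0  # assumed units


def f(x):
    """An arbitrary real function whose derivative plays the role of q_x."""
    return 0.5 * x * x


def df(x):
    return x


def ddx(func, x, h=1e-5):
    """Central finite-difference derivative."""
    return (func(x + h) - func(x - h)) / (2 * h)


def psi_new(x):
    """Transformed wave function (arbitrary smooth complex test function)."""
    return cmath.exp(-x * x / 2) * (1 + 0.5j * x)


def psi_old(x):
    """Original wave function, so that psi_new = exp(i f / hbar) * psi_old."""
    return cmath.exp(-1j * f(x) / HBAR) * psi_new(x)


def p_general(func, x):
    """General form: -i*hbar*d/dx plus multiplication into df/dx."""
    return -1j * HBAR * ddx(func, x) + df(x) * func(x)


def p_simple(func, x):
    """Simple form -i*hbar*d/dx."""
    return -1j * HBAR * ddx(func, x)


def mismatch(x):
    """Difference between exp(i f/hbar) p' psi_old and p psi_new at x."""
    lhs = cmath.exp(1j * f(x) / HBAR) * p_general(psi_old, x)
    return abs(lhs - p_simple(psi_new, x))


if __name__ == "__main__":
    for x in (-1.3, 0.0, 0.7):
        print(x, mismatch(x))
```

The mismatch is at the level of the finite-difference error, which illustrates that the two forms of the operator describe the same physical quantity acting on differently phased wave functions.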

It is a distinctive feature of quantum mechanics that an operator
for a physical quantity can have different forms [Eqs. (3.6)
and (3.12), for instance] and that in passing from one form to
another [Eqs. (3.17)] the wave function $\psi$ undergoes a
transformation [see Eq. (3.13)]. One might think that such arbitrariness
would lead to an ambiguity in the laws of quantum mechanics.
This is a false assumption. All quantities that can be compared
with experimental results (eigenvalues of operators, for instance)
are unambiguous. In Section 12 we will return to this question
and find the transformations under which physical quantities are
invariant.

4. Eigenfunctions and eigenvalues 
of the momentum operator 

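We see that using the method of classical analogy we can find
the operators of the Cartesian components of momentum, which
are

$$ p_x = -i\hbar \frac{\partial}{\partial x}, \qquad p_y = -i\hbar \frac{\partial}{\partial y}, \qquad p_z = -i\hbar \frac{\partial}{\partial z} $$

What are the eigenvalues and eigenfunctions of these operators?
To solve this problem, we first denote the eigenvalues of
the corresponding operators as $p'_x, p'_y, p'_z$. The eigenvalue
equations are

$$ -i\hbar \frac{\partial \psi^{(1)}}{\partial x} = p'_x \psi^{(1)}, \qquad -i\hbar \frac{\partial \psi^{(2)}}{\partial y} = p'_y \psi^{(2)}, \qquad -i\hbar \frac{\partial \psi^{(3)}}{\partial z} = p'_z \psi^{(3)} \qquad (4.2) $$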

The solution for the first of these equations is

$$ \psi^{(1)} = f^{(1)}(y, z)\, e^{i x p'_x / \hbar} \qquad (4.3) $$

where $f^{(1)}$ does not depend on x. The solution is finite for all
values of x if and only if $p'_x$ is real. Thus the eigenvalues of $p_x$
form a continuous spectrum ranging from $-\infty$ to $+\infty$. In a
similar manner we can write the other two solutions:

$$ \psi^{(2)} = f^{(2)}(z, x)\, e^{i y p'_y / \hbar}, \qquad \psi^{(3)} = f^{(3)}(x, y)\, e^{i z p'_z / \hbar} \qquad (4.3^*) $$

It is easy to see that Eqs. (4.2) have the general solution

$$ \psi^{(1)} = \psi^{(2)} = \psi^{(3)} = \psi \qquad (4.4) $$

where $\psi = c \exp[(i/\hbar)(x p'_x + y p'_y + z p'_z)]$ with c a constant that
may, however, depend on $p'_x, p'_y, p'_z$. The normalization condition
will determine this constant. We will first consider the
one-dimensional case. We put

$$ \psi = c\, e^{i x p / \hbar} \qquad (4.5) $$

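The normalization condition [see Eq. (7.20), Chapter II] for
this case is

$$ \lim_{\Delta p \to 0} \frac{1}{\Delta p} \int_{-\infty}^{+\infty} |\Delta \psi|^2\, dx = 1 \qquad (4.6) $$

where

$$ \Delta \psi = \int_{p}^{p + \Delta p} \psi(x, p)\, dp \qquad (4.7) $$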

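Equation (4.6) is satisfied if we consider c not depending on p.
We then obtain

$$ \Delta \psi = c\, \frac{\hbar}{i x}\, e^{i x p / \hbar} \left( e^{i x \Delta p / \hbar} - 1 \right) = c\, e^{i x (p + \Delta p / 2) / \hbar}\, \frac{2\hbar}{x} \sin\!\left( \frac{x \Delta p}{2\hbar} \right) \qquad (4.8) $$

Next

$$ \frac{1}{\Delta p} \int_{-\infty}^{+\infty} |\Delta \psi|^2\, dx = \frac{4 |c|^2 \hbar^2}{\Delta p} \int_{-\infty}^{+\infty} \frac{\sin^2 (x \Delta p / 2\hbar)}{x^2}\, dx = 2\hbar |c|^2 \int_{-\infty}^{+\infty} \frac{\sin^2 \xi}{\xi^2}\, d\xi = 2\pi\hbar |c|^2 \qquad (4.9) $$

since

$$ \int_{-\infty}^{+\infty} \frac{\sin^2 \xi}{\xi^2}\, d\xi = \pi \qquad (4.10) $$

Returning to (4.6), we find that

$$ c = \frac{1}{(2\pi\hbar)^{1/2}}\, e^{i\alpha} \qquad (4.11) $$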


where $\alpha$ is a real quantity, which we may consider equal to zero
without loss of generality.
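The value of the normalization constant can be checked by evaluating the left-hand side of (4.6) numerically with $|c|^2 = 1/(2\pi\hbar)$. The sketch below is a rough illustration under stated assumptions: $\hbar = 1$, a midpoint rule, and a finite integration range, so the result falls slightly short of 1 because the slowly decaying tails of $\sin^2\xi/\xi^2$ are cut off.

```python
import math


def eigendifferential_norm(dp, hbar=1.0, half_width=20000.0, n=200000):
    """(1/dp) * integral of |Delta psi|^2 over [-half_width, half_width].

    |Delta psi|^2 is taken from Eq. (4.8) with |c|^2 = 1/(2*pi*hbar).
    """
    c_squared = 1.0 / (2.0 * math.pi * hbar)
    dx = 2.0 * half_width / n
    total = 0.0
    for j in range(n):
        x = -half_width + (j + 0.5) * dx  # midpoints; x = 0 is never hit
        amplitude = (2.0 * hbar / x) * math.sin(x * dp / (2.0 * hbar))
        total += c_squared * amplitude * amplitude * dx
    return total / dp


if __name__ == "__main__":
    print(eigendifferential_norm(0.1))
```

Widening `half_width` (with `n` increased accordingly) moves the result closer to 1, in line with (4.9) and (4.10).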

Hence the normalized function $\psi(x, p)$ is

$$ \psi(x, p) = \frac{1}{(2\pi\hbar)^{1/2}}\, e^{i x p / \hbar} \qquad (4.12) $$

The expansion of an arbitrary function in momentum
eigenfunctions will then be [according to Eqs. (8.21), Chapter II]

$$ f(x) = \frac{1}{(2\pi\hbar)^{1/2}} \int_{-\infty}^{+\infty} e^{i x p / \hbar}\, \varphi(p)\, dp \qquad (4.13) $$

where

$$ \varphi(p) = \frac{1}{(2\pi\hbar)^{1/2}} \int_{-\infty}^{+\infty} e^{-i x p / \hbar} f(x)\, dx \qquad (4.13^*) $$

Equations (4.13) and (4.13*) define the Fourier integral
transformations, and $\varphi(p)$ is called the Fourier transform of f(x)
[quite naturally, f(x) is called the inverse Fourier transform of
$\varphi(p)$].

If we now consider the three-dimensional case, we must write
the normalization condition as

$$ \lim \frac{1}{\Delta p_x\, \Delta p_y\, \Delta p_z} \iiint |\Delta \psi|^2\, dx\, dy\, dz = 1 \qquad (4.14) $$

where

$$ \Delta \psi = \int_{p'_x}^{p'_x + \Delta p_x} dp'_x \int_{p'_y}^{p'_y + \Delta p_y} dp'_y \int_{p'_z}^{p'_z + \Delta p_z} dp'_z\, \psi(x, y, z;\ p'_x, p'_y, p'_z) \qquad (4.15) $$

The normalization condition (4.14) will obviously be satisfied if

$$ \psi(x, y, z;\ p'_x, p'_y, p'_z) = \psi(x; p'_x)\, \psi(y; p'_y)\, \psi(z; p'_z) = \frac{1}{(2\pi\hbar)^{3/2}} \exp\!\left[ \frac{i}{\hbar} (x p'_x + y p'_y + z p'_z) \right] \qquad (4.16) $$

The expansion of an arbitrary function of the three coordinates
in the momentum eigenfunctions will then be written in the form
of a triple integral.
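The one-dimensional Fourier pair (4.13), (4.13*) can be tried out numerically on a Gaussian, for which the transform is known in closed form. The sketch below is an illustration only; the sample function `f`, the value `HBAR = 1.0`, the integration range and the midpoint rule are assumptions made for the example.

```python
import cmath
import math

HBAR = 1.0  # assumed units
NORM = 1.0 / math.sqrt(2.0 * math.pi * HBAR)


def integrate(func, lo=-8.0, hi=8.0, n=400):
    """Midpoint rule; adequate here because the integrands decay like a Gaussian."""
    dx = (hi - lo) / n
    return sum(func(lo + (j + 0.5) * dx) for j in range(n)) * dx


def f(x):
    """Arbitrary sample function: a Gaussian."""
    return math.exp(-x * x / 2.0)


def phi(p):
    """Expansion coefficient of Eq. (4.13*)."""
    return NORM * integrate(lambda x: cmath.exp(-1j * x * p / HBAR) * f(x))


def f_reconstructed(x):
    """Expansion in momentum eigenfunctions, Eq. (4.13)."""
    return NORM * integrate(lambda p: cmath.exp(1j * x * p / HBAR) * phi(p))


if __name__ == "__main__":
    print("phi(0.5)             =", phi(0.5))
    print("f(0.3) reconstructed =", f_reconstructed(0.3), "direct:", f(0.3))
```

For $\hbar = 1$ the transform of $e^{-x^2/2}$ is $e^{-p^2/2}$, so both printed values can be compared with elementary expressions. The three-dimensional case factorizes into three such integrals.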

5. Quantum description of systems 

In Section 5, Chapter II, we found that an eigenvalue of an
operator is related to the corresponding eigenfunction by the
formula

$$ \lambda = \frac{\int \bar{\psi}\, L \psi\, dx}{\int \bar{\psi}\, \psi\, dx} \qquad (5.1) $$

We note that when $\psi$ belongs to a continuous spectrum, we must
take the proper differential instead of the eigenfunction. Thus by
defining the eigenfunction we can find the corresponding
eigenvalue of the operator, which in turn is related to the physical
quantity. In this sense we can speak of a function as defining the
state of the system.

The right-hand side of (5.1) retains its meaning when $\psi$ is not
an eigenfunction of L. We will clarify the physical meaning of this
fact in the next chapter.

We will call a wave function a function $\psi$ that defines the state
of the system. Let us illustrate the important concept of a wave
function as a means of describing a system’s state. As we have
shown in the previous section the function

$$ \psi(x, y, z) = \frac{1}{(2\pi\hbar)^{3/2}} \exp\!\left[ \frac{i}{\hbar} (x p'_x + y p'_y + z p'_z) \right] \qquad (5.2) $$

is the simultaneous eigenfunction for all three momentum
components $p_x, p_y, p_z$. It therefore defines the state of the electron with

$$ p_x = p'_x, \qquad p_y = p'_y, \qquad p_z = p'_z \qquad (5.3) $$

The other quantities (for instance, the coordinates of the
electron) do not have definite values in the state of the electron
with a wave function (5.2) since $\psi$ is not an eigenfunction of the
position operator.

Hence the quantum description of the state of the electron has
the feature that only one group of quantities (for instance,
$p_x, p_y, p_z$) can have definite values; the other group (in our
example x, y, z) remains undetermined. This result agrees with
the fact mentioned in Chapter I, the impossibility of precise and
simultaneous measurement of all the quantities that in classical
physics characterize the state of the electron.

But how does one decide which quantities can be measured
simultaneously and which cannot? Let us reason in the following
manner. From the experiment we gain a knowledge of the state
of the electron, that is, of a certain wave function $\psi$. If the
measurement of two quantities L and M gives definite values $\lambda$ and $\mu$,
then, according to what has just been said, $\psi$ will be an
eigenfunction of both L (with eigenvalue $\lambda$) and M (with eigenvalue $\mu$).
But for L and M to have simultaneous eigenfunctions both must
satisfy certain conditions, which we will now examine.

6. Commutativity of operators 

Let $\psi = \psi(x; \lambda, \mu)$ be a simultaneous eigenfunction 3 of
operators L and M:

$$ L\psi = \lambda\psi, \qquad M\psi = \mu\psi \qquad (6.1) $$

We operate on the first equation with M and on the second
with L:

$$ ML\psi = \lambda M\psi = \lambda\mu\psi $$

$$ LM\psi = \mu L\psi = \mu\lambda\psi $$

Hence

$$ ML\psi(x; \lambda, \mu) = LM\psi(x; \lambda, \mu) \qquad (6.2) $$

We assume now that all the simultaneous eigenfunctions
constitute a complete set. Then any other function $\psi(x)$ can be
expressed as a series (or an integral) of the type

$$ \psi(x) = \sum_{\lambda, \mu} c(\lambda, \mu)\, \psi(x; \lambda, \mu) \qquad (6.3) $$

Since (6.2) holds for each term in expansion (6.3), it holds for
the sum as a whole provided the series for $ML\psi$ converges. We
see that for an arbitrary $\psi(x)$


$$ ML\psi = LM\psi \qquad (6.4) $$

or

$$ ML - LM = 0 \qquad (6.4^*) $$

that is, L and M commute. We have thus proved the following
theorem.

If the simultaneous eigenfunctions of two operators L and M
form a complete set, the operators commute.

3 The letter x denotes either the independent variable or the set of all the
independent variables of a function.

We will now prove the converse theorem: 

If two operators L and M commute, they have simultaneous 
eigenfunctions. 

Let eigenvalue $\lambda$ of operator L have corresponding to it one or
several eigenfunctions $\psi(x; \lambda, k)$, where k denotes the different
functions for one specific value of $\lambda$. The most general solution
of the eigenvalue problem

$$ L\psi = \lambda\psi \qquad (6.5) $$

is

$$ \psi = \sum_{k} c(k)\, \psi(x; \lambda, k) \qquad (6.6) $$

Operating with M on the equation

$$ L\psi(x; \lambda, k) = \lambda\, \psi(x; \lambda, k) \qquad (6.5^*) $$

and using the fact that L and M commute, we obtain

$$ ML\psi = L(M\psi) = \lambda M\psi \qquad (6.7) $$

The function $\psi' = M\psi$ is thus the eigenfunction of L
corresponding to the eigenvalue $\lambda$. Hence it can be expressed as a linear
superposition of eigenfunctions $\psi(x; \lambda, k')$:

$$ M\psi(x; \lambda, k) = \sum_{k'} M(k', k)\, \psi(x; \lambda, k') \qquad (6.8) $$

where obviously the expansion coefficients M(k', k) depend on k'
as well as on k. With this in mind we can construct the linear
superposition of type (6.6) that will also satisfy the equation

$$ M\psi = \mu\psi \qquad (6.9) $$

We substitute (6.6) into (6.9) and use Eq. (6.8). Equating the
coefficients of $\psi(x; \lambda, k)$ yields a set of s equations

$$ \sum_{k'} M(k, k')\, c(k') = \mu\, c(k) \qquad (6.10) $$

where s is the number of possible values of k for a given $\lambda$. In
other words, s is equal to the degree of degeneracy of $\lambda$. If we
denote the solutions of Eqs. (6.10) as

$$ c^{(1)}(k),\ c^{(2)}(k),\ \ldots,\ c^{(s)}(k) \qquad (6.11) $$

and the corresponding values of $\mu$ as

$$ \mu_1,\ \mu_2,\ \ldots,\ \mu_s \qquad (6.12) $$

then the functions

$$ \psi(x; \lambda, \mu_s) = \sum_{k} c^{(s)}(k)\, \psi(x; \lambda, k) \qquad (6.13) $$

will be the solutions to both (6.5) and (6.9), which means that
they are simultaneous eigenfunctions of operators L and M. The
proof is complete.

We assumed that $\lambda$ was s-fold degenerate with s finite. This, in
turn, leads to a finite number of eigenfunctions corresponding to
each value of $\lambda$. The theorem also holds for the case when s is
infinite.

What is the physical meaning of these theorems? It is the
following:

The commutativity of operators expresses the possibility of
measuring the corresponding physical quantities simultaneously.
Or, in other words, the noncommutativity of operators makes a
simultaneous measurement of the corresponding quantities
impossible.

An example of two commutative operators with simultaneous 
eigenfunctions was given in Section 4 of this chapter. 
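The construction used in the proof of the converse theorem can be followed on a small matrix example. The sketch below is an illustration under stated assumptions: the 3×3 matrices `L` and `M` are invented for the example, the eigenvalue 1 of `L` is doubly degenerate (s = 2), and the reduced eigenvalue problem (6.10) is solved in closed form for a real symmetric 2×2 block.

```python
import math


def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def eig_sym_2x2(a, b, d):
    """Eigenpairs of the real symmetric matrix [[a, b], [b, d]] with b != 0."""
    mean, radius = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    pairs = []
    for mu in (mean + radius, mean - radius):
        v = (b, mu - a)  # solves (a - mu) * v0 + b * v1 = 0
        norm = math.hypot(v[0], v[1])
        pairs.append((mu, (v[0] / norm, v[1] / norm)))
    return pairs


# Hypothetical commuting pair; e_1 and e_2 span the degenerate subspace of L.
L = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
M = [[0, 1, 0], [1, 0, 0], [0, 0, 5]]
LAMBDA = 1

# M(k', k) restricted to the degenerate subspace, cf. Eq. (6.8); solving
# Eq. (6.10) there gives the coefficients c(k) of the superposition (6.6).
SIMULTANEOUS = [
    (mu, [c[0], c[1], 0.0]) for mu, c in eig_sym_2x2(M[0][0], M[0][1], M[1][1])
]

if __name__ == "__main__":
    for mu, vec in SIMULTANEOUS:
        print("mu =", mu, "vector =", vec)
```

Each resulting vector is an eigenvector of `L` with eigenvalue 1 and of `M` with its own value of $\mu$, which mirrors the role of the functions (6.13).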

7. Angular momentum 

As an example of noncommutative operators let us examine
three operators

$$ m_x = y p_z - z p_y, \qquad m_y = z p_x - x p_z, \qquad m_z = x p_y - y p_x \qquad (7.1) $$

These are composed of position and momentum operators in the
same way as angular momentum is in classical mechanics. Later
in examining the quantum equations of the motion of the electron
we will see that (7.1) can indeed be interpreted as angular
momentum operators.

We set up the quantum Poisson bracket involving the operators
(7.1) and the position and momentum operators. We find that

$$ [m_x, x] = \frac{i}{\hbar} (m_x x - x m_x) = 0 \qquad (7.2) $$

This is obvious since $m_x$ does not contain differentiation with
respect to x and thus commutes with multiplication into x. We
also find that

$$ [m_x, y] = -z\, [p_y, y] = -z, \qquad [m_x, z] = y\, [p_z, z] = y \qquad (7.2^*) $$

where we have used the properties of the Poisson bracket known
from Section 2. In a similar manner

$$ [m_x, p_x] = 0, \qquad [m_x, p_y] = -p_z, \qquad [m_x, p_z] = p_y \qquad (7.3) $$

If we use (7.2) and (7.3) to find the Poisson bracket for two
different components of angular momentum, we obtain

$$ [m_x, m_y] = [m_x, z p_x - x p_z] = [m_x, z]\, p_x - x\, [m_x, p_z] = y p_x - x p_y = -m_z $$

Hence

$$ [m_y, m_z] = -m_x, \qquad [m_z, m_x] = -m_y, \qquad [m_x, m_y] = -m_z \qquad (7.4) $$


We see that all these relationships correspond exactly to the 
classical ones. 
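The classical counterparts of relations (7.4) can be confirmed with the classical bracket in the sign convention of Section 2. The sketch below is an illustration only: it evaluates the bracket by central finite differences at one arbitrarily chosen phase-space point (`Q`, `P`), and all names are assumptions made for the example.

```python
def poisson_bracket(F, G, q, p, h=1e-4):
    """Classical bracket: sum_k (dF/dp_k * dG/dq_k - dF/dq_k * dG/dp_k)."""
    def partial(func, in_p, k):
        def at(delta):
            qs, ps = list(q), list(p)
            (ps if in_p else qs)[k] += delta
            return func(qs, ps)
        return (at(h) - at(-h)) / (2 * h)

    return sum(
        partial(F, True, k) * partial(G, False, k)
        - partial(F, False, k) * partial(G, True, k)
        for k in range(len(q))
    )


# Classical angular momentum components, q = (x, y, z), p = (p_x, p_y, p_z).
def m_x(q, p):
    return q[1] * p[2] - q[2] * p[1]


def m_y(q, p):
    return q[2] * p[0] - q[0] * p[2]


def m_z(q, p):
    return q[0] * p[1] - q[1] * p[0]


Q, P = [0.4, -1.1, 0.8], [1.5, 0.2, -0.6]  # arbitrary sample point

if __name__ == "__main__":
    print("[m_x, m_y] =", poisson_bracket(m_x, m_y, Q, P), " -m_z =", -m_z(Q, P))
```

Since the components are bilinear in coordinates and momenta, central differences reproduce the derivatives up to rounding error.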

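Our task now is to find the eigenvalues and eigenfunctions of
the operators $m_x, m_y, m_z$. The eigenvalue equation for $m_z$ is

$$ \frac{\hbar}{i} \left( x \frac{\partial \psi}{\partial y} - y \frac{\partial \psi}{\partial x} \right) = m'_z \psi \qquad (7.5) $$

with $m'_z$ being an eigenvalue of $m_z$. If we use cylindrical
coordinates $\rho, \varphi, z$ with

$$ x = \rho \cos\varphi, \qquad y = \rho \sin\varphi $$

then Eq. (7.5) reads

$$ \frac{\hbar}{i} \frac{\partial \psi}{\partial \varphi} = m'_z \psi \qquad (7.5^*) $$

The solution is of the form

$$ \psi = \psi_0(z, \rho)\, e^{i m'_z \varphi / \hbar} \qquad (7.6) $$

This is a single-valued function of a point in space only if it
is a periodic function of $\varphi$ with period $2\pi$, which yields

$$ m'_z = m_3 \hbar, \qquad m_3 = 0, \pm 1, \pm 2, \ldots \qquad (7.7) $$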
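The single-valuedness requirement can be illustrated by comparing the angular factor $e^{i m'_z \varphi/\hbar}$ at $\varphi$ and at $\varphi + 2\pi$ for several trial values of $m'_z/\hbar$. This is a small sketch, not part of the original text; the function name, the sample angle and the tolerance are assumptions made for the example.

```python
import cmath
import math


def is_single_valued(m_over_hbar, phi=0.4, tol=1e-9):
    """True if exp(i * m' * phi / hbar) is unchanged by phi -> phi + 2*pi."""
    before = cmath.exp(1j * m_over_hbar * phi)
    after = cmath.exp(1j * m_over_hbar * (phi + 2.0 * math.pi))
    return abs(after - before) < tol


if __name__ == "__main__":
    for trial in (-2, -1, 0, 1, 2, 0.5, 1.3):
        print(trial, is_single_valued(trial))
```

Only the integer trial values pass the test, in agreement with the quantization condition on $m'_z$.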


We have thus found the eigenvalues and eigenfunctions of $m_z$. In
the same way one can find them for the other two operators. To
compare them we will return to the rectangular Cartesian
coordinates. Function (7.6), which we denote $\psi_3$, and the eigenfunctions
$\psi_1$ and $\psi_2$ of $m_x$ and $m_y$ will then be of the form

$$ \psi_1 = f_1\!\left( x, \sqrt{y^2 + z^2} \right) (y + iz)^{m_1} $$

$$ \psi_2 = f_2\!\left( y, \sqrt{z^2 + x^2} \right) (z + ix)^{m_2} $$

$$ \psi_3 = f_3\!\left( z, \sqrt{x^2 + y^2} \right) (x + iy)^{m_3} \qquad (7.8) $$

and the eigenvalues will be

$$ m'_x = m_1 \hbar, \qquad m'_y = m_2 \hbar, \qquad m'_z = m_3 \hbar \qquad (7.9) $$

where $m_1, m_2, m_3$ are integers.

We have come to a conclusion that at first glance seems
paradoxical: the angular momentum components in any direction can
assume, when measured, only integral values that are multiples
of a definite quantity $\hbar$. This seems even stranger in view of the
fact that the components of a vector in two infinitely close
directions differ by an infinitely small quantity.

But this paradox is easily explained. First let us note that the
only simultaneous eigenfunction of $m_x, m_y, m_z$ corresponds to the
simultaneous eigenvalues

$$ m_1 = m_2 = m_3 = 0 \qquad (7.10) $$

and is

$$ \psi = \psi_1 = \psi_2 = \psi_3 = f(r), \qquad r = (x^2 + y^2 + z^2)^{1/2} \qquad (7.11) $$

In this case the angular momentum vector (and hence its
components in any direction) is zero, which implies that a paradox
does not exist. But if one of the eigenvalues is nonzero, the
operators $m_x, m_y, m_z$ have not a single simultaneous eigenfunction.
Hence a state of the electron in which two or three components of
angular momentum have definite values simultaneously is
impossible. This means that only one of the components can be integral.
What is the physical significance of this fact? To measure a
component of an electron’s angular momentum in a definite
direction one must influence the electron in some way, say, by
switching on a magnetic field in this direction. This influence
“tunes” the electron in such a way that its angular momentum
component in the direction of the field assumes integral values.
Other components remain undefinable because it is impossible to
measure them without changing the direction of the field, that is,
without getting the electron “out of tune”. We conclude that the
properties of angular momentum resulting from the theory under
consideration express the inevitable influence of the measuring
process on an object.

8. The energy operator

In the classical theory for a vast variety of systems the time
dependence of the state of a system (equations of motion) is
defined by introducing the Hamiltonian function, which is the
total energy of the system. In quantum mechanics too we can
introduce an energy operator, commonly called the Hamiltonian,
which defines the time dependence of the state of a system (this
will be proved in Section 13 of this chapter). For this reason the
choice of a specific Hamiltonian is an important step in
constructing the theory. When this is done, the choice of operators for
the other physical quantities (angular momentum, for example) is
more limited. In classical mechanics we take the simplest
quantities (position and momentum) and construct different
combinations possessing “advantageous” properties (for instance,
remaining constant in the process of motion). In quantum mechanics
too we take the simplest operators and construct combinations
that have simple properties and allow an obvious interpretation.
When speaking of the properties of operators, we have in mind
for the most part their commutativity with other operators and
their time dependence. Since the time dependence is associated
with the form of the Hamiltonian (see Section 13 of this chapter),
it is evident that the choice of “advantageous” combinations of the
simplest operators depends on this basic operator.

The classical Hamiltonian function has different forms
depending on whether the special theory of relativity is taken into account.
Only for the one-body problem was the relativistic Hamiltonian
function found in explicit form. For the many-body problem this
proved to be impossible. The situation is similar in quantum
mechanics. Here too the relativistic Hamiltonian was found only for
the one-body problem, and it differs drastically from the
nonrelativistic Hamiltonian. We will study the relativistic case in
Part V, which is devoted to Dirac’s theory of the electron. For
the present we will deal with the Hamiltonian ignoring the
relativistic effects.

In the classical theory the kinetic energy expressed in terms of
the components of momentum in a rectangular Cartesian
coordinate system is

$$ T = \frac{1}{2m} \left( p_x^2 + p_y^2 + p_z^2 \right) \qquad (8.1) $$

If we consider $p_x, p_y, p_z$ as operators (3.6), formula (8.1) will
become an operator, which we can call the kinetic-energy
operator. We note that if in (8.1) we had used, say, a spherical
coordinate system instead of a Cartesian one, then

$$ T = \frac{1}{2m} \left( p_r^2 + \frac{1}{r^2}\, p_\theta^2 + \frac{1}{r^2 \sin^2\theta}\, p_\varphi^2 \right) \qquad (8.2) $$

If we had then interpreted $p_r, p_\theta, p_\varphi$ as $-i\hbar\,(\partial/\partial r)$, $-i\hbar\,(\partial/\partial\theta)$,
$-i\hbar\,(\partial/\partial\varphi)$, we would have obtained an operator T* that would
not coincide with T. For this reason let us assume that when
passing from a classical function to a quantum mechanical
operator we must use only rectangular coordinates. If a classical
formula (involving components in a rectangular coordinate
system) does not contain factors that become noncommutative when
transformed into operators, the transformation is unique. But, of
course, it remains to be proved by comparing theory with
experimental practice whether such classical analogy is legitimate.

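The kinetic-energy operator can be expressed in terms of the
Laplacian operator. If we substitute (3.6) for $p_x, p_y, p_z$, we obtain

$$ T\psi = -\frac{\hbar^2}{2m} \left( \frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} + \frac{\partial^2 \psi}{\partial z^2} \right) = -\frac{\hbar^2}{2m} \nabla^2 \psi \qquad (8.3) $$

Naturally, after the form of the operator is established, we can
transform to any coordinates. For instance, in spherical
coordinates

$$ T\psi = -\frac{\hbar^2}{2m} \left\{ \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial \psi}{\partial r} \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta\, \frac{\partial \psi}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2 \psi}{\partial \varphi^2} \right\} \qquad (8.4) $$

If we introduce the operators

$$ p_r = -i\hbar \frac{\partial}{\partial r}, \qquad p_\theta = -i\hbar \frac{\partial}{\partial \theta}, \qquad p_\varphi = -i\hbar \frac{\partial}{\partial \varphi} \qquad (8.5) $$

we can write the kinetic-energy operator as

$$ T = \frac{1}{2m} \left\{ \frac{1}{r^2}\, p_r\, r^2\, p_r + \frac{1}{r^2 \sin\theta}\, p_\theta \sin\theta\, p_\theta + \frac{1}{r^2 \sin^2\theta}\, p_\varphi^2 \right\} \qquad (8.6) $$

This expression differs from (8.2) only in the order of the
noncommutative multipliers; if they were commutative, the two would
have coincided.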

We know that the eigenvalues of the Laplacian operator are 
negative. Hence the eigenvalues of the kinetic-energy operator are 
positive, as they should be. 

For the eigenfunctions of the kinetic-energy operator we can
take the simultaneous eigenfunctions of $p_x, p_y, p_z$, which, as we
know, are of the form

$$ \psi = \frac{1}{(2\pi\hbar)^{3/2}} \exp\!\left[ \frac{i}{\hbar} (x p'_x + y p'_y + z p'_z) \right] \qquad (8.7) $$

Any function of type (8.7) in which the sum of the squares of
the parameters $p'_x, p'_y, p'_z$ has a definite value 2mT':

$$ p'^2_x + p'^2_y + p'^2_z = 2mT' \qquad (8.8) $$

and also any superposition of such functions (a sum or an
integral) is an eigenfunction of the kinetic-energy operator
corresponding to the eigenvalue T = T'. Hence we have an eigenvalue
of infinite-fold degeneracy. Functions of type (8.7) can be used to
compose superpositions that would at the same time be
eigenfunctions of other operators. This implies that these operators
commute with each other and with the kinetic-energy operator. In the
physical sense this means that the state of the electron is not fully
defined when we specify the kinetic energy alone. Hence we must
indicate values of other quantities, say, momentum.
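The degeneracy of the kinetic-energy eigenvalue can be made concrete by applying $-(\hbar^2/2m)\nabla^2$ to plane waves with different momentum vectors of equal length. The sketch below is an illustration under stated assumptions: $\hbar = m = 1$, the Laplacian is approximated by second central differences, the normalization factor of the plane wave is omitted because it cancels, and the sample point and momenta are arbitrary.

```python
import cmath

HBAR, MASS = 1.0, 1.0  # assumed units


def plane_wave(p):
    """Unnormalized momentum eigenfunction exp[(i/hbar)(x p_x + y p_y + z p_z)]."""
    return lambda x, y, z: cmath.exp(1j * (x * p[0] + y * p[1] + z * p[2]) / HBAR)


def kinetic(psi, r, h=1e-3):
    """Apply T = -(hbar^2 / 2m) * Laplacian at the point r (finite differences)."""
    x, y, z = r
    laplacian = (
        psi(x + h, y, z) + psi(x - h, y, z)
        + psi(x, y + h, z) + psi(x, y - h, z)
        + psi(x, y, z + h) + psi(x, y, z - h)
        - 6.0 * psi(x, y, z)
    ) / h**2
    return -HBAR**2 / (2.0 * MASS) * laplacian


def kinetic_eigenvalue(p, r=(0.2, -0.5, 0.9)):
    """Ratio (T psi) / psi, which should equal |p|^2 / 2m for a plane wave."""
    psi = plane_wave(p)
    return kinetic(psi, r) / psi(*r)


if __name__ == "__main__":
    print(kinetic_eigenvalue((0.3, -0.4, 1.2)))
    print(kinetic_eigenvalue((1.3, 0.0, 0.0)))
```

Both momentum vectors have the same squared length, so the two ratios agree up to the discretization error even though the states are different. This is why the kinetic energy alone does not fix the state.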

The kinetic-energy operator for a free electron is at the same
time its Hamiltonian. For an electron in an external field with a
potential energy U(x, y, z) we can by analogy with the classical
theory write the Hamiltonian in the form of a sum of the
kinetic- and potential-energy operators:

$$ H = \frac{1}{2m} \left( p_x^2 + p_y^2 + p_z^2 \right) + U(x, y, z) \qquad (8.9) $$

Here the operator U(x, y, z) acts on functions of coordinates and
thus yields multiplication into U(x, y, z). The eigenvalue equation
for H is

$$ -\frac{\hbar^2}{2m} \nabla^2 \psi + U(x, y, z)\, \psi = E \psi \qquad (8.10) $$

where E is the total energy of the electron in the field. This
equation was proposed in 1926 by Schrödinger and is called the
Schrödinger equation. We will examine the equation and its
solutions more closely in Part II. For the present we will say that,
apart from certain details, its corollaries are borne out by
experiments, which proves the validity of the original hypotheses.

The Schrödinger equation can serve to describe the behaviour
of an electron in an electrostatic field. It is only natural then to
try to generalize it for the case of a magnetic field. But it appears
that the classical model of the electron as an electrically charged
mass point is not sufficient to explain the electron’s behaviour in
a magnetic field and it is necessary to assign a definite magnetic
moment to the electron. In Part V we will perform this
generalization on the basis of Dirac’s theory of the electron.

9. Canonical transformation 

We have seen that the state of the electron can be described by
a function of coordinates or other independent variables, for
instance, the components of momentum. The transition from one
set of independent variables to another is done by canonical
transformation.

Let a wave function expressed in terms of coordinates be
$\psi(x, y, z)$ or simply $\psi(x)$ if by x we denote the set of all three
coordinates. We assume that we are dealing with an operator L
that has eigenvalues $\lambda$ and a complete set of eigenfunctions
$\varphi(x, \lambda)$. Then $\psi(x)$ can be expanded in a series in the
eigenfunctions $\varphi(x, \lambda)$:

$$ \psi(x) = \sum_{k} c(\lambda_k)\, \varphi(x, \lambda_k) + \int c(\lambda)\, \varphi(x, \lambda)\, d\lambda \qquad (9.1) $$

where the expansion coefficients [both $c(\lambda_k)$ and $c(\lambda)$] are defined
in terms of $\psi(x)$ thus:

$$ c(\lambda) = \int \bar{\varphi}(x, \lambda)\, \psi(x)\, dx \qquad (9.2) $$

Function $\psi(x)$ is specified by the set of coefficients $c(\lambda_k)$ and
$c(\lambda)$. For this reason, if $\psi(x)$ describes the state in terms of x,
then $c(\lambda)$ describes it in terms of $\lambda$. Also, if $\psi(x)$ is normalized, so
is $c(\lambda)$ because according to the completeness condition (see
Section 8 of the previous chapter)

$$ \int |\psi(x)|^2\, dx = \sum_{k} |c(\lambda_k)|^2 + \int |c(\lambda)|^2\, d\lambda \qquad (9.3) $$

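A state in which $\lambda = \lambda_n$ is described in terms of $\lambda$ by the
function

$$ c(\lambda_n) = 1; \qquad c(\lambda) = 0, \quad \lambda \neq \lambda_n \qquad (9.4) $$

If in a given state we find that $\lambda = \lambda'$ where $\lambda'$ belongs to a
continuous spectrum, the expansion coefficients $c(\lambda_k)$ in (9.1)
must be set equal to zero and the integral must be the Stieltjes
integral. We write it in the form

$$ \psi(x) = \varphi(x, \lambda') = \int \varphi(x, \lambda)\, d_\lambda c(\lambda, \lambda') \qquad (9.5) $$

where

$$ c(\lambda, \lambda') = 1, \quad \lambda > \lambda'; \qquad c(\lambda, \lambda') = 0, \quad \lambda < \lambda' \qquad (9.6) $$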

Formula (9.2) can be considered to be the expansion of $c(\lambda)$ in
the functions

$$ \varphi^{+}(\lambda, x) = \bar{\varphi}(x, \lambda) \qquad (9.7) $$

Here the expansion coefficients are the functions $\psi(x)$ defined by
(9.1). Later we will see that the $\varphi^{+}$ are the eigenfunctions of the
operator x in terms of $\lambda$. Thus a description in terms of x is
equivalent to a description in terms of $\lambda$.

Now let us see how operators change when we transfer from
one set of variables to another. We first take the operator L whose
eigenfunctions we have used in expansions. We apply it to $\psi(x)$.
Since $\varphi(x, \lambda)$ is an eigenfunction of L,

$$ L\varphi(x, \lambda) = \lambda\, \varphi(x, \lambda) \qquad (9.8) $$

and we obtain

$$ L\psi(x) = \sum_{k} \lambda_k\, c(\lambda_k)\, \varphi(x, \lambda_k) + \int \lambda\, c(\lambda)\, \varphi(x, \lambda)\, d\lambda \qquad (9.9) $$

Hence the transition from $\psi(x)$ to $L\psi(x)$, that is the application
of L, has corresponding to it the transition from $c(\lambda)$ to $\lambda c(\lambda)$, or
the multiplication into $\lambda$. Consequently, the operator L expressed
in terms of the independent variables $\lambda$, its eigenvalues, is simply
the multiplication into $\lambda$, which is as it should be. Indeed, we have
seen in Section 3 that the operator corresponding to the
independent variable is the multiplication into this variable.

Now instead of $L$ let us take another operator $M$ and apply it
to $\psi(x)$. For simplicity let us assume that $L$ has only a discrete
spectrum. So the series expansion of $\psi(x)$ in the eigenfunctions
of $L$ will be

$$\psi(x) = \sum_k c(\lambda_k)\,\varphi(x, \lambda_k) \tag{9.10}$$

Applying $M$ to $\psi(x)$, we find that

$$M\psi(x) = \sum_k c(\lambda_k)\,M\varphi(x, \lambda_k) \tag{9.11}$$

In turn we expand $M\varphi(x, \lambda_k)$ in the $\varphi(x, \lambda_n)$:

$$M\varphi(x, \lambda_k) = \sum_n (\lambda_n | M | \lambda_k)\,\varphi(x, \lambda_n) \tag{9.12}$$

where $(\lambda_n | M | \lambda_k)$ are the expansion coefficients:

$$(\lambda_n | M | \lambda_k) = \int \bar\varphi(x, \lambda_n)\,M\varphi(x, \lambda_k)\,dx \tag{9.13}$$

Substituting (9.12) into (9.11), we obtain

$$M\psi(x) = \sum_n c'(\lambda_n)\,\varphi(x, \lambda_n) \tag{9.14}$$

with $c'(\lambda_n)$ defined as

$$c'(\lambda_n) = Mc(\lambda_n) = \sum_k (\lambda_n | M | \lambda_k)\,c(\lambda_k) \tag{9.15}$$

Hence the transition from $\psi(x)$ to $M\psi(x)$ has corresponding to
it the transition from $c(\lambda_n)$ to $c'(\lambda_n) = Mc(\lambda_n)$. Therefore $M$
expressed in terms of $\lambda$ has the form (9.15).
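
The passage from (9.10) to (9.15) can be checked in a finite-dimensional model. The Python sketch below is an illustration only and not part of the original course: it assumes a two-point "x-space", so that integrals over x become sums, and the names `phi`, `M_x`, `psi`, `inner`, and `apply_op` are hypothetical. It computes the coefficients by the analog of (9.2) and the matrix elements by the analog of (9.13), and then compares the result of (9.15) with the coefficients of Mψ obtained directly.

```python
import math

# Orthonormal eigenvectors phi(x, lambda_n) of a hypothetical operator L
# on a two-point "x-space" (a real rotation by the angle theta).
theta = 0.3
phi = [
    [math.cos(theta), math.sin(theta)],    # phi(x, lambda_0)
    [-math.sin(theta), math.cos(theta)],   # phi(x, lambda_1)
]

# A Hermitian operator M in terms of x, and a state psi(x).
M_x = [[1.0, 2.0 - 1.0j], [2.0 + 1.0j, -0.5]]
psi = [0.6, 0.8j]

def inner(u, v):
    """Discrete analog of the integral of conj(u(x)) * v(x) over x."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply_op(op, v):
    """Apply a matrix (operator) to a vector (function)."""
    return [sum(op[i][j] * v[j] for j in range(len(v))) for i in range(len(op))]

# Coefficients c(lambda_n), analog of (9.2).
c = [inner(phi[n], psi) for n in range(2)]

# Matrix elements (lambda_n | M | lambda_k), analog of (9.13).
M_lam = [[inner(phi[n], apply_op(M_x, phi[k])) for k in range(2)]
         for n in range(2)]

# c'(lambda_n) from (9.15), and the same coefficients computed directly from M psi.
c_prime = apply_op(M_lam, c)
c_direct = [inner(phi[n], apply_op(M_x, psi)) for n in range(2)]

print(c_prime)
print(c_direct)
```

The two printed lists agree to rounding error because the eigenvectors form a complete orthonormal set, which is the only property of φ used in the derivation.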
If $L$ has a continuous spectrum as well as a discrete spectrum,
instead of (9.10), (9.12), (9.14), and (9.15) we have

$$\psi(x) = \sum_k c(\lambda_k)\,\varphi(x, \lambda_k) + \int c(\lambda)\,\varphi(x, \lambda)\,d\lambda \tag{9.10*}$$

$$M\varphi(x, \lambda) = \sum_n (\lambda_n | M | \lambda)\,\varphi(x, \lambda_n) + \int (\lambda' | M | \lambda)\,\varphi(x, \lambda')\,d\lambda' \tag{9.12*}$$

$$M\psi(x) = \sum_k c'(\lambda_k)\,\varphi(x, \lambda_k) + \int c'(\lambda)\,\varphi(x, \lambda)\,d\lambda \tag{9.14*}$$

$$c'(\lambda) = Mc(\lambda) = \sum_k (\lambda | M | \lambda_k)\,c(\lambda_k) + \int (\lambda | M | \lambda')\,c(\lambda')\,d\lambda' \tag{9.15*}$$

where in (9.12*) and (9.15*) $\lambda$ is an eigenvalue belonging either
to the continuous or to the discrete spectrum. We can define
$(\lambda | M | \lambda')$ in the same way as in (9.13). It could happen, however,
that the integral in the expression for $(\lambda | M | \lambda')$ does not have a
definite value. This means that in terms of $\lambda$ there is no kernel
for $M$. In such a case we construe $M$ in terms of $\lambda$ as an operator
that maps the expansion coefficients $c(\lambda)$ of $\psi(x)$ into the
expansion coefficients $c'(\lambda)$ of $M\psi(x)$ even if $c'(\lambda)$ is not given by
(9.15*).

Let us illustrate this. We assume that $M$ is the position operator,
which means that when $M$ is applied to $\psi(x)$, the result is
multiplication by $x$. What we want to find is the expression for
$x$ in terms of $\lambda$. The eigenvalue equation is

$$\sum_k (\lambda | x | \lambda_k)\,c(\lambda_k) + \int (\lambda | x | \lambda')\,c(\lambda')\,d\lambda' = x\,c(\lambda) \tag{9.16}$$

It is easy to check that the solution is

$$c(\lambda) = \varphi^+(\lambda, x) = \bar\varphi(x, \lambda) \tag{9.17}$$

with $\varphi(x, \lambda)$ being an eigenfunction of the operator $L$ in terms
of $x$. Indeed, if we recall the condition

$$\overline{(\lambda | x | \lambda')} = (\lambda' | x | \lambda) \tag{9.18}$$

which expresses the hermiticity of the operator $x$, and if we
replace $c(\lambda)$ by $\bar\varphi(x, \lambda)$, we see that the complex conjugate of
Eq. (9.16) is

$$\sum_k (\lambda_k | x | \lambda)\,\varphi(x, \lambda_k) + \int (\lambda' | x | \lambda)\,\varphi(x, \lambda')\,d\lambda' = x\,\varphi(x, \lambda) \tag{9.19}$$

But this is precisely the expansion of the product $x\varphi(x, \lambda)$ in a
complete set of eigenfunctions $\varphi(x, \lambda')$, that is, Eq. (9.12*) with
$M = x$. This also holds when instead of $x$ we take an arbitrary
operator $M$. Thus

The eigenfunctions of an operator M in terms of L are complex
conjugates to the eigenfunctions of L in terms of M.

The result remains valid when operators do not have kernels.

10. An example of canonical transformation 

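As an example of canonical transformations we will consider
transformations of the momentum and position operators, $p$ and $x$.
We know that $p$ in terms of $x$, that is,

$$p = -i\hbar\frac{\partial}{\partial x} \tag{10.1}$$

has eigenfunctions of the form

$$\psi(x, p) = \frac{1}{(2\pi\hbar)^{1/2}}\,e^{ixp/\hbar} \tag{10.2}$$

Let us examine the form of $x$ in terms of $p$. The operator $x$,
by definition, maps $f(p)$, which is the Fourier transform of the
function $\psi(x)$ expressed as the Fourier integral

$$\psi(x) = \frac{1}{(2\pi\hbar)^{1/2}}\int_{-\infty}^{+\infty} f(p)\,e^{ixp/\hbar}\,dp \tag{10.3}$$

into another function $f'(p)$ such that

$$x\psi(x) = \frac{1}{(2\pi\hbar)^{1/2}}\int_{-\infty}^{+\infty} f'(p)\,e^{ixp/\hbar}\,dp \tag{10.4}$$

holds true. If we integrate by parts, we obtain

$$-\frac{i\hbar}{(2\pi\hbar)^{1/2}}\int_{-\infty}^{+\infty} f(p)\,d\,e^{ixp/\hbar} = \frac{1}{(2\pi\hbar)^{1/2}}\int_{-\infty}^{+\infty} i\hbar\frac{\partial f}{\partial p}\,e^{ixp/\hbar}\,dp$$

which yields

$$f'(p) = i\hbar\frac{\partial f}{\partial p} \tag{10.5}$$

We see that $x$ in terms of $p$ is

$$x = i\hbar\frac{\partial}{\partial p} \tag{10.6}$$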

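This agrees with the expression for the Poisson bracket

$$[p, x] = \frac{i}{\hbar}(px - xp) = 1 \tag{10.7}$$

because

$$\frac{i}{\hbar}\left(p\,i\hbar\frac{\partial}{\partial p} - i\hbar\frac{\partial}{\partial p}\,p\right) f(p) = f(p) \tag{10.8}$$

The eigenfunctions of $x$ in terms of $p$ are

$$\psi^+(p, x) = \frac{1}{(2\pi\hbar)^{1/2}}\,e^{-ixp/\hbar} \tag{10.9}$$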

Here p stands for the independent variable and x for a parameter, 
whereas in (10.2) the roles of p and x are interchanged. 

The function

$$\psi(x) = \frac{1}{(2\pi\hbar)^{1/2}}\,e^{ixp/\hbar} \tag{10.10}$$

of $x$ described a state of the electron with a definite momentum, $p$.
However, a state with a definite position, $x'$, was described in
terms of $x$ by a proper differential

$$dF(x, x') \tag{10.11}$$

with

$$F(x, x') = 1, \quad x > x'; \qquad F(x, x') = 0, \quad x < x' \tag{10.12}$$

On the other hand, in terms of $p$ a state with a definite position, $x$,
is described by

$$f(p) = \frac{1}{(2\pi\hbar)^{1/2}}\,e^{-ixp/\hbar} \tag{10.13}$$

and with a definite momentum, $p'$, by a proper differential

$$dF(p, p') \tag{10.14}$$

We must note that $F(p, p')$ depends on $p$ and $p'$ in the same way
as $F(x, x')$ depends on $x$ and $x'$.

It can easily be concluded, therefore, that a transition from one
representation to another is made, as in the general case, via the
Fourier integral, because for a state of the electron with a definite
position, $x = x'$,

$$f(p) = \psi^+(p, x') = \int \psi^+(p, x)\,d_x F(x, x') \tag{10.15}$$

and for a state with a definite momentum, $p = p'$,

$$\psi(x) = \psi(x, p') = \int \psi(x, p)\,d_p F(p, p') \tag{10.16}$$


11. Canonical transformation as an operator 

A canonical transformation is most conveniently written in
symbolic notation. We will denote by $S(x, \lambda)$ the operator that
maps a function $c(\lambda)$, which describes a state in terms of $\lambda$, into
a function $\psi(x)$, which describes the same state in terms of $x$.
We can then proceed to write the expansion (9.1) in symbolic
form:

$$\psi(x) = S(x, \lambda)\,c(\lambda) \tag{11.1}$$

What distinguishes this new operator from all the operators
that we have studied previously is that it transforms a function
of a definite independent variable into a function of another
independent variable, both functions describing the same state in terms
of different variables.

We can write the dependence of $c(\lambda)$ on $\psi(x)$, Eq. (9.2), in the
form

$$c(\lambda) = S^{-1}(\lambda, x)\,\psi(x) \tag{11.2}$$

Let us show that the inverse $S^{-1}(\lambda, x)$ coincides with the hermitian
conjugate $S^+(\lambda, x)$. Together with the functions $\psi(x)$ and $c(\lambda)$
we will consider two other functions, $\psi'(x)$ and $c'(\lambda)$, also related
by Eqs. (11.1) and (11.2). We generalize the previous definition
of a hermitian conjugate for the case of two independent variables
and denote by $S^+(\lambda, x)$ an operator that satisfies the condition

$$\int \bar\psi'(x)\,[S(x, \lambda)\,c(\lambda)]\,dx = \int \overline{[S^+(\lambda, x)\,\psi'(x)]}\;c(\lambda)\,d\lambda \tag{11.3}$$

(In the case of a discrete spectrum we substitute a sum for an
integral.) By (11.1) and the completeness condition the left side
of (11.3) is

$$\int \bar\psi'(x)\,\psi(x)\,dx = \int \bar c'(\lambda)\,c(\lambda)\,d\lambda \tag{11.4}$$

The right sides of (11.3) and (11.4) coincide for any $c(\lambda)$ only if

$$c'(\lambda) = S^+(\lambda, x)\,\psi'(x) \tag{11.5}$$

for any $\psi'(x)$. If we compare this with (11.2), we find that

$$S^{-1}(\lambda, x) = S^+(\lambda, x) \tag{11.6}$$

and consequently

$$S(x, \lambda)\,S^+(\lambda, x) = 1 \tag{11.7}$$

$$S^+(\lambda, x)\,S(x, \lambda) = 1 \tag{11.7*}$$


As we know, an operator that obeys these conditions is called 
unitary. Thus 

A transformation from one set of variables to another is done 
by means of a unitary operator. 
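
A finite-dimensional check of (11.7) and (11.7*) may help here. The Python sketch below is an illustration under stated assumptions, not the author's construction: it replaces the continuous kernel (10.2) by a discrete Fourier kernel on `N` points (the value of `N` and all function names are hypothetical), and it verifies both unitarity relations together with the preservation of normalization expressed by the completeness condition.

```python
import cmath
import math

N = 8  # number of grid points; an arbitrary choice for this sketch

# Kernel S(x, p) = exp(2*pi*i*x*p/N) / sqrt(N), a discrete stand-in for (10.2).
S = [[cmath.exp(2j * math.pi * x * p / N) / math.sqrt(N) for p in range(N)]
     for x in range(N)]

def dagger(A):
    """Hermitian conjugate: transpose and complex-conjugate."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def deviation_from_identity(A):
    """Largest absolute difference between A and the unit matrix."""
    return max(abs(A[i][j] - (1.0 if i == j else 0.0))
               for i in range(len(A)) for j in range(len(A)))

S_plus = dagger(S)
dev_left = deviation_from_identity(matmul(S, S_plus))    # analog of (11.7)
dev_right = deviation_from_identity(matmul(S_plus, S))   # analog of (11.7*)

# Norm preservation: sum |psi|^2 equals sum |c|^2 when psi = S c.
c = [complex(k + 1, -k) for k in range(N)]
psi = [sum(S[x][p] * c[p] for p in range(N)) for x in range(N)]
norm_c = sum(abs(v) ** 2 for v in c)
norm_psi = sum(abs(v) ** 2 for v in psi)

print(dev_left, dev_right, norm_c, norm_psi)
```

Both deviations come out at the level of rounding error, and the two norms coincide, which is the discrete counterpart of (11.4) with ψ' = ψ.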

Let us see how we can use $S$ to express a canonical transformation
of an operator corresponding to a definite physical quantity
$M$. If $M$ in terms of $x$ maps $\psi(x)$ into $\psi'(x)$, that is, if

$$\psi'(x) = M(x)\,\psi(x) \tag{11.8}$$

then the same operator $M$ in terms of $\lambda$ will, as we already know,
map $c(\lambda)$ into $c'(\lambda)$:

$$c'(\lambda) = M(\lambda)\,c(\lambda) \tag{11.9}$$

But

$$c'(\lambda) = S^+(\lambda, x)\,\psi'(x) = S^+(\lambda, x)\,M(x)\,\psi(x) \tag{11.10}$$

and

$$\psi(x) = S(x, \lambda)\,c(\lambda) \tag{11.11}$$

Substituting (11.11) into (11.10), we obtain

$$c'(\lambda) = S^+(\lambda, x)\,M(x)\,S(x, \lambda)\,c(\lambda) \tag{11.12}$$

which after comparison with (11.9) yields

$$M(\lambda) = S^+(\lambda, x)\,M(x)\,S(x, \lambda) \tag{11.13}$$

Hence the transformation (11.11) of $\psi(x)$ corresponds to the
transformation (11.13) of $M(x)$.

Obviously, if two unitary transformations are applied in succession,
the result is a third unitary transformation. Indeed, instead
of first transforming from a $\lambda$-representation to an $x$-representation
by means of a unitary transformation (operator) $S(x, \lambda)$
and then from the $x$-representation to a $p$-representation by means
of $T(p, x)$, we can transform directly from the $\lambda$- to the
$p$-representation by means of

$$U(p, \lambda) = T(p, x)\,S(x, \lambda) \tag{11.14}$$

which clearly is a unitary transformation (operator) too. Generally
speaking, $S(x, \lambda)$ has a kernel. If we compare (11.1)
with (9.1) and (11.2) or (11.5) with (9.2), we can easily see that

$$\text{kernel } S(x, \lambda) = \varphi(x, \lambda) \tag{11.15}$$

$$\text{kernel } S^+(\lambda, x) = \varphi^+(\lambda, x) = \bar\varphi(x, \lambda) \tag{11.15*}$$

The kernel of the operator of the unitary transformation from
L-representation (that is, from λ-representation) to x-representation
is an eigenfunction of the operator L in terms of x.

12. Unitary invariants 

In the process of finding the momentum operator in Section 3
we encountered the following fact. Not only the operators

$$p_k = -i\hbar\frac{\partial}{\partial x_k}, \qquad k = 1, 2, 3 \tag{12.1}$$

but the operators

$$p'_k = -i\hbar\frac{\partial}{\partial x_k} + \frac{\partial f}{\partial x_k} \tag{12.1*}$$

as well obeyed all the conditions resulting from the form of the
Poisson bracket. The functions $\psi$ and $\psi'$ were related thus:

$$\psi = e^{if/\hbar}\,\psi' \tag{12.2}$$

and the operators $p_k$ and $p'_k$ thus:

$$p'_k = e^{-if/\hbar}\,p_k\,e^{if/\hbar} \tag{12.3}$$

The real function $f(x_1, x_2, x_3)$ remained undefined.

If we compare (12.2) with (11.11) and (12.3) with (11.13), we
see that Eqs. (12.2) and (12.3) represent a unitary transformation
of a special type. Namely, the transformation is not associated
with a change in variables. It results in multiplication by a function
of the independent variables, the function being of modulus
unity. The operator of this transformation

$$S = e^{if/\hbar} \tag{12.4}$$

is called a phase factor.

We know that an operator for a given physical quantity can be
described in terms of different independent variables, or, so to say,
be in different representations. Now even when the representation
is chosen we are left with an arbitrary phase factor. Both the
transformation from one representation to another and the introduction
of a phase factor are determined by a unitary transformation.
This implies that any two representations of an operator
are interrelated by means of a unitary transformation. We can
say that

The form of an operator for a given physical quantity is determined
by the properties of the quantity only up to a unitary
transformation.

Since the properties of physical systems cannot have undefined 
elements, they must be expressed by mathematical relationships 
that remain invariant under unitary transformations. It is the 
invariants that play an important role in the theory. 

What do we mean by unitary invariants? The spectrum of eigenvalues
of an operator is one such invariant. So is the hermiticity
of an operator. Indeed, by Eq. (11.13),

$$M(\lambda) = S^+ M(x)\,S \tag{12.5}$$

Hence, according to the rule of finding the hermitian conjugate
of a product of operators,

$$M^+(\lambda) = S^+ M^+(x)\,(S^+)^+ = S^+ M^+(x)\,S \tag{12.6}$$

We come to the conclusion that from

$$M^+(x) = M(x)$$

follows

$$M^+(\lambda) = M(\lambda)$$

Also, the equations

$$M(x)\,\psi(x) = \mu\,\psi(x) \tag{12.7}$$

$$M(\lambda)\,c(\lambda) = \mu\,c(\lambda) \tag{12.8}$$

are equivalent, since one can be obtained from the other by means
of the transformation

$$\psi(x) = S\,c(\lambda) \tag{12.9}$$

For this reason the eigenvalues $\mu$ in (12.7) and (12.8) are the
same. This fact is closely related to the completeness condition,
according to which

$$\int \bar\psi(x)\,\psi'(x)\,dx = \int \bar c(\lambda)\,c'(\lambda)\,d\lambda \tag{12.10}$$

for any two pairs of functions [$\psi(x)$ and $c(\lambda)$, and $\psi'(x)$ and
$c'(\lambda)$] that satisfy Eq. (12.9). This leads us to believe that an
integral of type (12.10) is a unitary invariant. So is the following
integral:

$$\int \bar\psi(x)\,M(x)\,\psi(x)\,dx = \int \bar c(\lambda)\,M(\lambda)\,c(\lambda)\,d\lambda \tag{12.11}$$

whose physical meaning will be discussed in the next chapter.

Finally, any algebraic equation between linear operators, for
example,

$$N(x) = M(x) + L(x)$$

or

$$N(x) = M(x)\,L(x)$$

is left invariant by a unitary transformation, since if all three
operators $L(x)$, $M(x)$, $N(x)$ are subjected to the same transformation,
the new operators $L(\lambda)$, $M(\lambda)$, $N(\lambda)$ will be related by
the same equations. For instance, in transforming the operators $x$
and $p_x$ to any new variables, the Poisson bracket

$$\frac{i}{\hbar}(p_x x - x p_x) = 1$$

remains equal to unity.
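
The invariants listed in this section can be exhibited on 2×2 matrices. The following Python sketch is only an illustration: the unitary matrix `S`, its parameters `theta` and `alpha`, and the Hermitian matrices `M` and `L` are hypothetical choices, not taken from the text. It applies the transformation (12.5) to each operator and checks that hermiticity, the product relation N = ML, and the trace and determinant (which fix the eigenvalue spectrum of a 2×2 matrix) are unchanged.

```python
import cmath
import math

def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Hermitian conjugate: transpose and complex-conjugate."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def max_abs_diff(A, B):
    """Largest absolute difference between corresponding elements."""
    return max(abs(A[i][j] - B[i][j])
               for i in range(len(A)) for j in range(len(A[0])))

# A 2x2 unitary matrix S (theta and alpha are arbitrary illustrative values).
theta, alpha = 0.7, 0.4
S = [[math.cos(theta), -math.sin(theta) * cmath.exp(1j * alpha)],
     [math.sin(theta) * cmath.exp(-1j * alpha), math.cos(theta)]]
S_plus = dagger(S)

def transform(M):
    """M(lambda) = S^+ M(x) S, as in (12.5)."""
    return matmul(S_plus, matmul(M, S))

M = [[1.0, 2 - 1j], [2 + 1j, -3.0]]   # Hermitian
L = [[0.5, 1j], [-1j, 2.0]]           # Hermitian
N = matmul(M, L)                      # the algebraic relation N = M L

M_new, L_new, N_new = transform(M), transform(L), transform(N)

herm_dev = max_abs_diff(M_new, dagger(M_new))             # hermiticity is kept
product_dev = max_abs_diff(N_new, matmul(M_new, L_new))   # N = M L is kept

def trace(A):
    return A[0][0] + A[1][1]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

trace_dev = abs(trace(M_new) - trace(M))   # same sum of eigenvalues
det_dev = abs(det(M_new) - det(M))         # same product of eigenvalues

print(herm_dev, product_dev, trace_dev, det_dev)
```

All four printed deviations are at the level of rounding error.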
13. Time evolution of systems.
Time dependence of operators

When we considered the operators for different physical quantities
we did not account for time development. Yet in classical
mechanics all quantities depend on time. What is the equivalent
of this concept in quantum mechanics?

We know that an operator for a given quantity allows for
different mathematical representations, and the choice of a specific
representation remains arbitrary. Let us consider a representation
in which the mathematical form of the operators remains the same
for all instants of time $t$. Such a representation is possible only
if the eigenvalue spectrum of an operator does not change in time,
as is usually the case. For instance, the momentum operator $p_x$
can be represented as $-i\hbar(\partial/\partial x)$ for any $t$. If at time $t = 0$ the
momentum had a definite eigenvalue, for instance, $p_x = p'_x$, so that

$$p_x\psi = p'_x\psi, \qquad t = 0 \tag{13.1}$$

then at time $t > 0$ it can, generally speaking, take on any other
eigenvalue or become undefined. In our example

$$p_x\psi \neq p'_x\psi, \qquad t > 0 \tag{13.2}$$

Since the form of $p_x$ is assumed to remain unaltered, it is the form
of the wave function $\psi$ that changes. Thus

If we choose a representation in which the mathematical form 
of the operators does not change with time, the state of the 
system must be described by a time-dependent wave function. 

This time evolution can be symbolically written as

$$\psi(x, t) = S(t)\,\psi(x, 0) \tag{13.3}$$

with the time evolution operator $S(t)$ being a continuous function
of the time variable $t$ and turning into the identity operator at
time zero:

$$S(0) = 1 \tag{13.4}$$

We will assume the time evolution operator to be unitary:

$$S^+(t)\,S(t) = 1, \qquad S(t)\,S^+(t) = 1 \tag{13.5}$$

so as to preserve the normalization properties of our wave function
as time passes:

$$\int \bar\psi(x, t)\,\psi(x, t)\,dx = \int \bar\psi(x, 0)\,\psi(x, 0)\,dx \tag{13.6}$$

Let us now turn to another way of representing operators. We
recall that if a function is transformed by means of a unitary
transformation of type (13.3), all operators will undergo a unitary
transformation

$$L'(t) = S^+(t)\,L\,S(t) \tag{13.7}$$

and the equations

$$\psi'(x, t) = L\,\psi(x, t) \tag{13.8}$$

$$\psi'(x) = L'(t)\,\psi(x) \tag{13.8*}$$

are equivalent if

$$\psi'(x) = S^+\psi'(x, t) \tag{13.9}$$

$$\psi(x) = S^+\psi(x, t) \tag{13.9*}$$

The second representation of operators consists in taking the
operator corresponding to the quantity $L$ at time $t > 0$ to be the
time-varying operator $L'(t)$ defined by (13.7).

The difference between the two representations corresponds to
the difference in the ways of describing a system. In the first the
state of the system is described by a wave function of coordinates
and time, and in the second by a wave function of coordinates
exclusively, with time entering, if at all, as a parameter. If the
initial state (at $t = 0$) was

$$\psi(x) = \psi(x, 0)$$

then in the first representation the state at time $t > 0$ will be

$$\psi(x, t) = S(t)\,\psi(x) \tag{13.3*}$$

and in the second it will remain

$$\psi(x)$$

To see if $L$ will have a definite value at time $t > 0$ we must
check in the first representation whether $\psi(x, t)$ of (13.3*) will
be an eigenfunction of $L$, and in the second whether $\psi(x)$ will be
an eigenfunction of $L'(t)$. Hence in the first representation the
time dependence is thrown onto the wave functions, and in the
second onto the operators. The result is the same whatever
representation we use.

To find the time evolution operator $S(t)$ let us adopt the second
representation, in which the operators depend on time. Since in
this case the time dependence of the state of a system results in
a change in the form of the operators, it is natural to interpret
the time derivative of an operator for a given quantity as an
operator corresponding to the time rate of change of this quantity.
Such an interpretation can be taken as a definition for the
time-rate-of-change operator. We will now find the total time derivative
of $L'(t)$, keeping in mind the possibility of an explicit time
dependence of $L$. We obtain

$$\frac{dL'}{dt} = \dot S^+ L\,S + S^+\frac{\partial L}{\partial t}\,S + S^+ L\,\dot S \tag{13.10}$$

where the dot denotes differentiation with respect to time. Now we
must return to the first representation. For this we apply to the
operator (13.10) a transformation that is the inverse of the unitary
transformation (13.7) and define $dL/dt$ as

$$\frac{dL}{dt} = S\,\frac{dL'}{dt}\,S^+ \tag{13.11}$$

Substituting (13.10) into (13.11), we obtain

$$\frac{dL}{dt} = \frac{\partial L}{\partial t} + S\dot S^+ L + L\,\dot S S^+ \tag{13.12}$$

where we used the unitarity of $S$,

$$SS^+ = 1, \qquad S^+S = 1 \tag{13.13}$$

Differentiation of the left equation in (13.13) with respect to time
yields

$$\dot S S^+ + S\dot S^+ = 0 \tag{13.14}$$

The last equation shows that the operator $i\dot S S^+$, which we will
denote as

$$\frac{1}{\hbar}H^* = i\dot S S^+ \tag{13.15}$$

will be hermitian. With the help of (13.14) and (13.15), Eq. (13.12)
can be translated into

$$\frac{dL}{dt} = \frac{\partial L}{\partial t} + \frac{i}{\hbar}(H^*L - LH^*) \tag{13.16}$$

The second term on the right is the quantum Poisson bracket so
that the time derivative of $L$ is

$$\frac{dL}{dt} = \frac{\partial L}{\partial t} + [H^*, L] \tag{13.17}$$

This coincides with the classical expression for the time derivative
of $L$ provided that $H^*$ is the classical Hamiltonian function $H$.
We will assume that this is so from the very beginning. Irrespective
of the classical analogy this assumption stems from the law
of conservation of energy and Bohr’s frequency condition. Indeed,
according to the law of conservation of energy we must have

$$\frac{dH}{dt} = \frac{i}{\hbar}(H^*H - HH^*) = 0 \tag{13.18}$$
provided that the Hamiltonian $H$ has no explicit time dependence.
This equation will hold for any mechanical system, that is, for
any form of $H$, only if

$$H^* = f(H) \tag{13.19}$$

The form of the function $f(H)$ can be found with the help of
Bohr’s frequency condition. If we first leave $f(H)$ undefined, we
will obtain for the frequency of the light emitted in the transition
from the energy level $E$ to the level $E'$ the following expression:

$$\nu = \frac{1}{h}\,[f(E') - f(E)] \tag{13.20}$$

which coincides with experimental results only if $f(E) = E$.

Conversely, if we were to assume that by classical analogy
$H^* = H$, then we could derive the law of conservation of energy
and Bohr’s frequency condition.

We can thus consider it proved that

$$H = H^* = i\hbar\,\dot S S^+ \tag{13.21}$$

which transforms the expression (13.16) for the time derivative
of operator $L$ into

$$\frac{dL}{dt} = \frac{\partial L}{\partial t} + \frac{i}{\hbar}(HL - LH) \tag{13.22}$$

or

$$\frac{dL}{dt} = \frac{\partial L}{\partial t} + [H, L] \tag{13.23}$$

These equations are called the quantum equations of motion.

If we assume $H$ to be known, Eqs. (13.3) and (13.21) give the
law of the time evolution of the state of a system. Indeed, if we
differentiate (13.3) with respect to time, then

$$\frac{\partial\psi(x, t)}{\partial t} = \dot S(t)\,\psi(x, 0)$$

But

$$\psi(x, 0) = S^+(t)\,\psi(x, t)$$

which yields

$$\frac{\partial\psi(x, t)}{\partial t} = \dot S S^+\,\psi(x, t) \tag{13.24}$$

Substituting for $\dot S S^+$ its expression (13.21), we find that

$$H\psi - i\hbar\frac{\partial\psi}{\partial t} = 0 \tag{13.25}$$

This equation came to be known as the wave equation although it
does not belong to the type of equations that in mathematics are
called wave equations.
The wave equation (13.25) can also be obtained by abstract
reasoning. From the point of view of classical mechanics the
energy of a system, $H$, can be considered, up to a difference in
sign, to be the generalized momentum that is conjugate to time:

$$H = -p_t \tag{13.26}$$

If we transfer to quantum mechanics, by analogy with the operators
$p_x$, $p_y$, $p_z$ we can write

$$p_t = -i\hbar\frac{\partial}{\partial t} \tag{13.27}$$

By equating the results of applying the operators $H$ and $-p_t$ to a
wave function $\psi$ we arrive at Eq. (13.25). We could have said,
however, that the form of, say, $p_x$ was found from the condition
$[p_x, x] = 1$, whereas for energy and time the Poisson bracket was
not considered.

14. Heisenberg’s matrices 

We can set up the representation in which the time dependence
is shifted to the operators (in Section 13 it was the second
representation) in the following manner. Let

$$\psi_0(x),\ \psi_1(x),\ \psi_2(x),\ \ldots \tag{14.1}$$

be a complete, orthogonal, and normalized set of functions, for
instance, the eigenfunctions of some operator. We wish to find
the solution $\psi_n(x, t)$ to the wave equation

$$H\psi - i\hbar\frac{\partial\psi}{\partial t} = 0 \tag{14.2}$$

the solution satisfying the initial condition

$$\psi_n(x, 0) = \psi_n(x) \tag{14.3}$$

We can show that for each value of $t$ the solutions

$$\psi_0(x, t),\ \psi_1(x, t),\ \ldots,\ \psi_n(x, t),\ \ldots \tag{14.4}$$

form a complete, orthogonal, and normalized set. This is so even
if $H$ depends on time explicitly.

We now expand the function $\psi(x, 0)$, which describes the initial
state of the system under consideration, in a series involving the
functions (14.1):

$$\psi(x, 0) = \sum_{n=0}^{\infty} c_n\,\psi_n(x) \tag{14.5}$$

Then at time $t > 0$ the state will be described by

$$\psi(x, t) = \sum_{n=0}^{\infty} c_n\,\psi_n(x, t) \tag{14.6}$$

where $c_n$ are the same constants as in (14.5). If we take number $n$
(which labels the function $\psi_n$) for the independent variable, the
state will be described (in this representation) both at $t = 0$ and
at $t > 0$ by the same function of $n$,

$$c(n) = c_n \tag{14.7}$$

Hence in this representation the time dependence is transferred
to the operators, and to find the form of the operators in this
representation it is sufficient to shift to the new variable, $n$.

We can easily find the matrix (kernel) of an operator $L$ in the
$n$-representation. By the general formula (9.13),

$$(n | L(t) | n') = \int \bar\psi_n(x, t)\,L\,\psi_{n'}(x, t)\,dx \tag{14.8}$$

Such a representation has the property that for the operator
$dL/dt$, which is the time rate of change of $L$, the matrix elements
are equal to the time derivative of the corresponding matrix
elements of $L$:

$$\frac{d}{dt}(n | L | n') = \left(n \left| \frac{dL}{dt} \right| n'\right) \tag{14.9}$$

This follows from the fact that the time dependence is shifted to
the operators. Direct proof of (14.9) is given in Section 4,
Chapter IV.

In our discussion we assumed that the $\psi_n$ form a discrete set,
the set of eigenfunctions of an operator with a discrete spectrum.
But this restriction is not essential. The functions could have been
the eigenfunctions of an operator with a continuous spectrum,
with $n$ being a continuous parameter. Let us assume, for a moment,
that the solution of Eq. (14.2), where we set

$$\psi(x, t)\big|_{t=0} = f(x) \tag{14.10}$$

can be represented in the form of an integral:

$$\psi(x, t) = \int \psi(x, t; x_0)\,f(x_0)\,dx_0 \tag{14.11}$$

Equation (14.11) replaces Eq. (14.6), and $\psi(x, t; x_0)$ takes the
part of $\psi_n(x, t)$, parameter $x_0$ the part of the integer $n$, and $f(x_0)$
the part of $c_n$. If we compare the definition (13.3) of the time
evolution operator $S(t)$ with (14.11), we see that $\psi(x, t; x_0)$ is the
kernel of $S(t)$. We note that in the simpler cases (a free electron,
an electron in a homogeneous electric field, an oscillator) $\psi(x, t; x_0)$
can be found in closed form.*

* See, for instance, Chapter XIII in L. de Broglie, Einführung in die
Wellenmechanik, Akademische Verlagsgesellschaft M. B. H., Leipzig, 1929.

The representation of operators in which functions (14.1) are
the energy eigenfunctions (we assume that the Hamiltonian, $H$,
is time independent) plays a special role. This is due to its
simplicity and also to its use in studying the radiation of atoms. Let

$$H\psi_n(x) = E_n\,\psi_n(x) \tag{14.12}$$

The solution of the wave equation (14.2), given the initial
conditions (14.3), will obviously be

$$\psi_n(x, t) = e^{-iE_n t/\hbar}\,\psi_n(x) \tag{14.13}$$

If we use these functions to express the matrix elements of a
time-varying operator, we find that

$$(n | L(t) | n') = e^{i(E_n - E_{n'})t/\hbar}\int \bar\psi_n(x)\,L\,\psi_{n'}(x)\,dx \tag{14.14}$$

The first form of quantum mechanics, discovered by Heisenberg 
in 1925, was in this “matrix” formulation. Without introducing 
the operator concept Heisenberg associated with each physical 
quantity a matrix of type (14.14). We will call these Heisenberg’s 
matrices, and the representation in which the time dependence is 
shifted to the operators corresponding to physical quantities, the 
Heisenberg picture. 
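
The equivalence of the two ways of carrying the time dependence can be made concrete with a small numerical model. The Python sketch below is an illustration under stated assumptions: it uses three hypothetical energy levels `E`, a hypothetical Hermitian matrix `L0` standing for the elements (n|L|n') at t = 0, units with ħ = 1, and it takes the Heisenberg matrix to be exp(i(E_n − E_n')t/ħ)(n|L|n'), which is what (14.8) gives when the functions (14.13) are inserted. It then evaluates an integral of type (12.11) once with fixed coefficients c_n and a time-dependent matrix, and once with time-dependent coefficients and a fixed matrix.

```python
import cmath

hbar = 1.0                      # units with hbar = 1 (an assumption of this sketch)
E = [0.5, 1.5, 2.5]             # hypothetical energy levels E_n
L0 = [[0.0, 1.0, 0.0],          # hypothetical Hermitian matrix (n|L|n') at t = 0
      [1.0, 0.0, 2.0],
      [0.0, 2.0, 0.0]]
c = [0.6, 0.48, 0.64j]          # constant coefficients c_n of the state, as in (14.7)
t = 0.37                        # an arbitrary instant of time

# Heisenberg matrix: (n|L(t)|n') = exp(i (E_n - E_n') t / hbar) (n|L|n').
L_t = [[cmath.exp(1j * (E[n] - E[k]) * t / hbar) * L0[n][k] for k in range(3)]
       for n in range(3)]

# Time dependence on the operators: fixed c_n, time-dependent matrix L_t.
integral_heisenberg = sum(c[n].conjugate() * L_t[n][k] * c[k]
                          for n in range(3) for k in range(3))

# Time dependence on the state: c_n acquire exp(-i E_n t / hbar), matrix stays L0.
c_t = [cmath.exp(-1j * E[n] * t / hbar) * c[n] for n in range(3)]
integral_schrodinger = sum(c_t[n].conjugate() * L0[n][k] * c_t[k]
                           for n in range(3) for k in range(3))

print(integral_heisenberg, integral_schrodinger)
```

The two values coincide, and the imaginary part vanishes to rounding error because the matrix is Hermitian at every t.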

Formula (14.13) suggests that a state with a definite energy is
a stationary state. Indeed, $\psi_n(x, t)$ of (14.13) remains an
eigenfunction of the Hamiltonian for any instant of time, so that if
at time zero the energy has a definite value, it will have the same
value at subsequent times. This is another way of stating the law
of conservation of energy.

We will end this section with a note of historical interest. The
first to consider the wave function $\psi(x, y, z, t)$ was Louis de
Broglie (1925). De Broglie introduced the idea of the associated
wave for an electron and thus founded wave mechanics. Shortly
after, Schrödinger (1926) in a series of papers offered a mathematical
formulation for the problem of stationary atomic states. He
showed that the problem reduces to finding the eigenfunctions
and eigenvalues of a certain operator (the Hamiltonian). In the
same year Schrödinger discovered that his own wave mechanical
and Heisenberg’s matrix theories are mathematically equivalent.
But it was only later that a satisfactory physical interpretation of
the wave function was elaborated.

15. Semiclassical approximation 

When Planck’s constant can be considered small in comparison
with quantities of like dimensions encountered in a given problem,
one can approximate the solution of the corresponding Schrödinger
equation by means of a solution of the Hamilton-Jacobi equation
of classical mechanics.
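Let us consider the Schrödinger equation

$$-\frac{\hbar^2}{2m}\nabla^2\psi + U(x, y, z)\,\psi = i\hbar\frac{\partial\psi}{\partial t} \tag{15.1}$$

where $U$ is a given function of position, and let us seek its
solution in the form

$$\psi = \psi' e^{iS/\hbar} \tag{15.2}$$

with $\psi'$ a formal power series in $\hbar$. Substituting (15.2) into (15.1)
yields

$$\left[\frac{1}{2m}(\operatorname{grad} S)^2 + U + \frac{\partial S}{\partial t}\right]\psi' = i\hbar\left[\frac{1}{m}(\operatorname{grad} S\cdot\operatorname{grad}\psi') + \frac{1}{2m}\psi'\,\nabla^2 S + \frac{\partial\psi'}{\partial t}\right] + \frac{\hbar^2}{2m}\nabla^2\psi' \tag{15.3}$$

If we neglect the term proportional to $\hbar^2$ and set equal to zero
separately the term not dependent on $\hbar$ and the term proportional
to $\hbar$, we come to two equations

$$\frac{1}{2m}(\operatorname{grad} S)^2 + U + \frac{\partial S}{\partial t} = 0 \tag{15.4}$$

$$\frac{1}{m}(\operatorname{grad} S\cdot\operatorname{grad}\psi^0) + \frac{1}{2m}\psi^0\,\nabla^2 S + \frac{\partial\psi^0}{\partial t} = 0 \tag{15.5}$$

In the second, for $\psi'$ we substituted its approximate value $\psi^0$,
which corresponds to $\hbar \to 0$.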

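Equation (15.4) is the Hamilton-Jacobi equation of classical
mechanics. Equation (15.5) can be transformed into the equation
of continuity of classical hydrodynamics. This fact can be proved
by multiplying Eq. (15.5) by $2\psi^0$ and assuming that

$$(\psi^0)^2 = \rho \tag{15.6}$$

We obtain

$$\frac{1}{m}(\operatorname{grad} S\cdot\operatorname{grad}\rho) + \frac{\rho}{m}\nabla^2 S + \frac{\partial\rho}{\partial t} = 0 \tag{15.7}$$

or

$$\operatorname{div}\left(\frac{\rho}{m}\operatorname{grad} S\right) + \frac{\partial\rho}{\partial t} = 0 \tag{15.8}$$

In classical mechanics $\operatorname{grad} S = \mathbf{p}$ is the momentum vector
and $(1/m)\operatorname{grad} S = \mathbf{v}$ is the velocity. Hence Eq. (15.8) can be
represented in the following form:

$$\operatorname{div}(\rho\mathbf{v}) + \frac{\partial\rho}{\partial t} = 0 \tag{15.9}$$

which is the equation of continuity.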

The solution to the Hamilton-Jacobi equation is commonly
called the action function, which can be obtained by introducing
the Lagrangian function

$$\mathcal{L} = \tfrac{1}{2}m(\dot x^2 + \dot y^2 + \dot z^2) - U(x, y, z) \tag{15.10}$$

and evaluating the integral (the action integral)

$$S = \int_{t_0}^{t}\mathcal{L}\,dt \tag{15.11}$$

along the particle’s trajectory. To calculate the action integral we
first express the Lagrangian function in terms of time and the
integration constants, of which there are six, because for one
particle three second-order Lagrange equations are needed. Integration
in (15.11) then yields an action integral $S$, in terms of time and
the initial and final values of coordinates, that satisfies the
Hamilton-Jacobi equation.

The solution thus obtained is not unique. Other solutions depend
not on the initial values of coordinates but on other constants of
integration $c_1$, $c_2$, $c_3$. Furthermore, instead of a rectangular
Cartesian coordinate system we can use some other kind of a system,
but this ambiguity can be eliminated if we confine ourselves to a
rectangular one.

Let

$$S = S(x, y, z, t, c_1, c_2, c_3) \tag{15.12}$$

be a solution to the Hamilton-Jacobi equation. From classical
mechanics we know that

$$\frac{\partial S}{\partial x} = p_x, \quad \frac{\partial S}{\partial y} = p_y, \quad \frac{\partial S}{\partial z} = p_z, \quad \frac{\partial S}{\partial t} = -H \tag{15.13}$$

where $p_x$, $p_y$, $p_z$ are the particle’s momentum components, and $H$
is the total energy (the Hamiltonian function). Furthermore, the
derivatives of $S$ with respect to the constants $c_1$, $c_2$, $c_3$ are new
constants, which we denote $b_1$, $b_2$, $b_3$, so that we have

$$\frac{\partial S}{\partial c_1} = b_1, \quad \frac{\partial S}{\partial c_2} = b_2, \quad \frac{\partial S}{\partial c_3} = b_3 \tag{15.14}$$

In the special case when for $c_1$, $c_2$, $c_3$ we take the initial values of
coordinates, $x_0$, $y_0$, $z_0$, the constants $b_1$, $b_2$, $b_3$ are the initial values
of momentum components but with the opposite sign.

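Let us solve Eqs. (15.8) and (15.9) assuming that the solution
(15.12) to Eq. (15.4) has already been found. Now we prove that
for $\rho$ we can take the determinant

$$\rho = \begin{vmatrix}
\dfrac{\partial^2 S}{\partial x\,\partial c_1} & \dfrac{\partial^2 S}{\partial y\,\partial c_1} & \dfrac{\partial^2 S}{\partial z\,\partial c_1} \\[2mm]
\dfrac{\partial^2 S}{\partial x\,\partial c_2} & \dfrac{\partial^2 S}{\partial y\,\partial c_2} & \dfrac{\partial^2 S}{\partial z\,\partial c_2} \\[2mm]
\dfrac{\partial^2 S}{\partial x\,\partial c_3} & \dfrac{\partial^2 S}{\partial y\,\partial c_3} & \dfrac{\partial^2 S}{\partial z\,\partial c_3}
\end{vmatrix} \tag{15.15}$$

(or, since $\rho$ is positive, its absolute value).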
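We differentiate (15.4) with respect to the constants included
in $S$, that is, $c_1$, $c_2$, $c_3$, to get

$$\frac{1}{m}\frac{\partial S}{\partial x}\frac{\partial^2 S}{\partial x\,\partial c_k} + \frac{1}{m}\frac{\partial S}{\partial y}\frac{\partial^2 S}{\partial y\,\partial c_k} + \frac{1}{m}\frac{\partial S}{\partial z}\frac{\partial^2 S}{\partial z\,\partial c_k} = -\frac{\partial^2 S}{\partial t\,\partial c_k} \tag{15.16}$$

with $k = 1, 2, 3$. If we use the fact that

$$\frac{1}{m}\frac{\partial S}{\partial x} = v_x, \quad \frac{1}{m}\frac{\partial S}{\partial y} = v_y, \quad \frac{1}{m}\frac{\partial S}{\partial z} = v_z \tag{15.17}$$

we can rewrite Eqs. (15.16) as

$$v_x\frac{\partial^2 S}{\partial x\,\partial c_k} + v_y\frac{\partial^2 S}{\partial y\,\partial c_k} + v_z\frac{\partial^2 S}{\partial z\,\partial c_k} = -\frac{\partial^2 S}{\partial t\,\partial c_k} \tag{15.18}$$

($k = 1, 2, 3$). The three equations can be solved for the “unknowns”
$v_x$, $v_y$, $v_z$, and the determinant of the coefficients of the “unknowns”
is $\rho$ [see Eq. (15.15)].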

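To simplify further formulas let us use the notation (15.14).
Equations (15.18) transform into

$$v_x\frac{\partial b_k}{\partial x} + v_y\frac{\partial b_k}{\partial y} + v_z\frac{\partial b_k}{\partial z} = -\frac{\partial b_k}{\partial t} \tag{15.19}$$

(these relations show that the $b_k$ are constant in time, which has
been discussed before). The determinant $\rho$ will then be

$$\rho = \begin{vmatrix}
\dfrac{\partial b_1}{\partial x} & \dfrac{\partial b_1}{\partial y} & \dfrac{\partial b_1}{\partial z} \\[2mm]
\dfrac{\partial b_2}{\partial x} & \dfrac{\partial b_2}{\partial y} & \dfrac{\partial b_2}{\partial z} \\[2mm]
\dfrac{\partial b_3}{\partial x} & \dfrac{\partial b_3}{\partial y} & \dfrac{\partial b_3}{\partial z}
\end{vmatrix} = \frac{\partial(b_1, b_2, b_3)}{\partial(x, y, z)} \tag{15.20}$$

and, solving Eqs. (15.19) by determinants, $\rho v_x$, $\rho v_y$, $\rho v_z$ will then be

$$\rho v_x = -\begin{vmatrix}
\dfrac{\partial b_1}{\partial t} & \dfrac{\partial b_1}{\partial y} & \dfrac{\partial b_1}{\partial z} \\[2mm]
\dfrac{\partial b_2}{\partial t} & \dfrac{\partial b_2}{\partial y} & \dfrac{\partial b_2}{\partial z} \\[2mm]
\dfrac{\partial b_3}{\partial t} & \dfrac{\partial b_3}{\partial y} & \dfrac{\partial b_3}{\partial z}
\end{vmatrix} = -\frac{\partial(b_1, b_2, b_3)}{\partial(t, y, z)} \tag{15.21}$$

$$\rho v_y = -\begin{vmatrix}
\dfrac{\partial b_1}{\partial x} & \dfrac{\partial b_1}{\partial t} & \dfrac{\partial b_1}{\partial z} \\[2mm]
\dfrac{\partial b_2}{\partial x} & \dfrac{\partial b_2}{\partial t} & \dfrac{\partial b_2}{\partial z} \\[2mm]
\dfrac{\partial b_3}{\partial x} & \dfrac{\partial b_3}{\partial t} & \dfrac{\partial b_3}{\partial z}
\end{vmatrix} = -\frac{\partial(b_1, b_2, b_3)}{\partial(x, t, z)} \tag{15.22}$$

$$\rho v_z = -\begin{vmatrix}
\dfrac{\partial b_1}{\partial x} & \dfrac{\partial b_1}{\partial y} & \dfrac{\partial b_1}{\partial t} \\[2mm]
\dfrac{\partial b_2}{\partial x} & \dfrac{\partial b_2}{\partial y} & \dfrac{\partial b_2}{\partial t} \\[2mm]
\dfrac{\partial b_3}{\partial x} & \dfrac{\partial b_3}{\partial y} & \dfrac{\partial b_3}{\partial t}
\end{vmatrix} = -\frac{\partial(b_1, b_2, b_3)}{\partial(x, y, t)} \tag{15.23}$$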
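Substituting these into

$$\frac{\partial}{\partial x}(\rho v_x) + \frac{\partial}{\partial y}(\rho v_y) + \frac{\partial}{\partial z}(\rho v_z) + \frac{\partial\rho}{\partial t} \tag{15.24}$$

we find that (15.24) is identically zero. Therefore the equation of
continuity (15.9) is satisfied. Consequently, Eq. (15.5) for the
function $\psi^0$ is also satisfied, with $\psi^0$ related to $\rho$ by (15.6).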

Let us illustrate this theory by the case of a mass point in free
motion. Since in free motion velocity is constant and potential
energy is zero, we obtain

$$S = \int_0^t \tfrac{1}{2}mv^2\,dt = \tfrac{1}{2}mv^2 t \tag{15.25}$$

(we have set $t_0 = 0$). For the constants of integration we will
take the initial values $x_0$, $y_0$, $z_0$ of the coordinates $x$, $y$, $z$. Then we
will have

$$x = x_0 + v_x t, \quad y = y_0 + v_y t, \quad z = z_0 + v_z t \tag{15.26}$$

and hence

$$S = \frac{m}{2t}\left[(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2\right] \tag{15.27}$$




The determinant of the second derivatives of S, which are 
d 2 S m d 2 S m d 2 S m 

dxdx 0 ~ t' dydy 0 t ’ dzdz Q ~~ t 


(the second derivatives with respect to different coordinates are 
zero), will be a quantity inversely proportional to t 3 . This means 
that we can set 


constant 



p V* = in¬ 


constant 


(15.29) 


which implies that the approximate value of \j> is 


$ _ constant exp | [(* — x 0 f + (y — y 0 ) 2 + (z — z 0 ) 2 ]J (15.30) 


Substitution of this expression for t|) into the Schrodinger equa¬ 
tion shows that (15.30) is an exact solution and not an approxima¬ 
tion. (This can be seen without computation if we use Eq. (15.3) 
and keep in mind that at t|/ = t|)°, where tjj 0 is given by (15.29), 
V 2 !))' = 0.) 
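The exactness of (15.30) can also be checked numerically. The sketch below is only an illustration: it assumes units with $\hbar = m = 1$, sets the constant in (15.30) to 1, takes $x_0 = y_0 = z_0 = 0$, and approximates the derivatives by central differences. The names `psi` and `schrodinger_residual` are hypothetical helpers, not notation from the text.

```python
import cmath

HBAR = 1.0  # illustrative units (assumption): hbar = m = 1
MASS = 1.0


def psi(x, y, z, t, x0=0.0, y0=0.0, z0=0.0):
    """Eq. (15.30) with the arbitrary constant set to 1."""
    r2 = (x - x0) ** 2 + (y - y0) ** 2 + (z - z0) ** 2
    return t ** -1.5 * cmath.exp(1j * MASS * r2 / (2.0 * HBAR * t))


def schrodinger_residual(x, y, z, t, h=1e-3):
    """i*hbar*dpsi/dt + (hbar^2/2m)*laplacian(psi), by central differences.

    For an exact free-particle solution this vanishes up to the
    discretization error of the finite differences.
    """
    dpsi_dt = (psi(x, y, z, t + h) - psi(x, y, z, t - h)) / (2.0 * h)
    centre = psi(x, y, z, t)
    laplacian = (
        psi(x + h, y, z, t) + psi(x - h, y, z, t)
        + psi(x, y + h, z, t) + psi(x, y - h, z, t)
        + psi(x, y, z + h, t) + psi(x, y, z - h, t)
        - 6.0 * centre
    ) / h ** 2
    return 1j * HBAR * dpsi_dt + HBAR ** 2 / (2.0 * MASS) * laplacian


# Evaluate at an arbitrary point; |psi| = 1 there because t = 1.
residual = abs(schrodinger_residual(0.3, -0.2, 0.5, 1.0))
scale = abs(psi(0.3, -0.2, 0.5, 1.0))
```

The residual is small compared with `scale` and shrinks as the step `h` is reduced, which is what one expects of an exact solution evaluated with approximate derivatives.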
16. Relation between canonical transformation
and the contact transformation of classical mechanics

For systems that have a classical analog the canonical
transformation of operators corresponds to the contact transformation
of classical mechanics.

Let $q_1, q_2, \ldots, q_n$ and $p_1, p_2, \ldots, p_n$ be the
coordinates and momenta prior to transformation and
$Q_1, Q_2, \ldots, Q_n$ and $P_1, P_2, \ldots, P_n$ the coordinates and
momenta after transformation. We consider the case when the
transformation function depends on both old and new coordinates:

$$S = S(q_1, \ldots, q_n;\ Q_1, \ldots, Q_n) \tag{16.1}$$

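A contact transformation is determined by a relationship between
differentials, namely

$$\sum_{r=1}^{n} p_r\,dq_r - \sum_{r=1}^{n} P_r\,dQ_r = dS \tag{16.2}$$

which implies that

$$p_r = \frac{\partial S}{\partial q_r}, \qquad P_r = -\frac{\partial S}{\partial Q_r} \tag{16.3}$$

By solving Eqs. (16.3) one can determine q and p in terms of
Q and P and the reverse. A solution to Eqs. (16.3) always exists
since we assume that the determinant of the derivatives of S
does not vanish:

$$\operatorname{Det} \frac{\partial^2 S}{\partial q_r\,\partial Q_s} \neq 0 \tag{16.4}$$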
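As a minimal sketch of how Eqs. (16.3) and (16.4) work in practice, the code below takes a one-dimensional generating function of the free-particle type, $S(q, Q) = m(q - Q)^2/(2t)$ (the one-dimensional analog of (15.27) with Q in the role of the initial coordinate), and evaluates the derivatives numerically. The values of `MASS` and `TIME` and the helper names are assumptions chosen for illustration.

```python
MASS = 2.0  # illustrative values (assumptions), not taken from the text
TIME = 0.5


def S(q, Q):
    """Generating function S(q, Q) = m (q - Q)^2 / (2 t)."""
    return MASS * (q - Q) ** 2 / (2.0 * TIME)


def momenta(q, Q, h=1e-5):
    """Eq. (16.3): p = dS/dq and P = -dS/dQ, by central differences."""
    p = (S(q + h, Q) - S(q - h, Q)) / (2.0 * h)
    P = -(S(q, Q + h) - S(q, Q - h)) / (2.0 * h)
    return p, P


def mixed_derivative(q, Q, h=1e-3):
    """d^2 S / (dq dQ): the one-dimensional case of the determinant in (16.4)."""
    return (
        S(q + h, Q + h) - S(q + h, Q - h)
        - S(q - h, Q + h) + S(q - h, Q - h)
    ) / (4.0 * h * h)


q, Q = 1.3, 0.4
p, P = momenta(q, Q)

# Because the mixed derivative (-m/t) is nonzero, (16.3) can be inverted:
# here q = Q + P t / m and p = P, i.e. uniform motion with momentum P.
q_from_new_variables = Q + P * TIME / MASS
```

For this choice of S the old variables are recovered from the new ones as q = Q + Pt/m and p = P, so the contact transformation simply describes free motion over the time t.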

What is the situation in quantum mechanics? The contact
transformation of classical mechanics has corresponding to it in
quantum mechanics a canonical transformation from a representation
in which q is “diagonal” to a representation in which Q is
“diagonal”. The canonical transformation has the following form.
For the sake of brevity⁵ we will denote by $\Psi_Q(q)$ the
simultaneous eigenfunctions of the Q's in terms of the q's.

We choose F as the operator we will transform. The kernel or
matrix of the transformed operator F* will then be

$$(Q'\,|\,F^*\,|\,Q) = \int \bar{\Psi}_{Q'}(q)\,F\,\Psi_Q(q)\,dq \tag{16.5}$$

with dq the product of the differentials:

$$dq = dq_1\,dq_2 \ldots dq_n$$

⁵ We will often denote the totality of variables $q_1, \ldots, q_n$ with one symbol q.
We use p, Q, P in a similar manner.
The eigenfunction $\Psi_Q(q)$ can be considered the kernel
$(q\,|\,U^{\dagger}\,|\,Q)$ of a unitary operator $U^{\dagger} = U^{-1}$,
and we can write (16.5) as

$$F^* = U F U^{-1} \tag{16.6}$$

Formula (16.5) with F = 1 reduces to the orthogonality condition.
Then on the left we will find the kernel of the identity operator
in terms of Q, namely

$$(Q'\,|\,1\,|\,Q) = \delta_0(Q - Q') = \delta(Q_1 - Q_1') \ldots \delta(Q_n - Q_n') \tag{16.7}$$

where $\delta$ is Dirac’s delta function [Eq. (16.7) can be considered
the definition of $\delta_0$].

In the semiclassical approximation, for $\Psi_Q(q)$ we can take

$$\Psi_Q(q) = c\left(\left|\frac{\partial^2 S}{\partial q\,\partial Q}\right|\right)^{1/2} e^{iS/\hbar} \tag{16.8}$$

which is a generalization of the expression obtained in the previous
section. For the sake of brevity we have introduced the notation

$$\frac{\partial^2 S}{\partial q\,\partial Q} = \operatorname{Det} \frac{\partial^2 S}{\partial q_r\,\partial Q_s} \tag{16.9}$$

We also note that in (16.8) the expression in parentheses is the
absolute value of this determinant. The constant c in (16.8) is

$$c = (2\pi\hbar)^{-n/2} \tag{16.10}$$

Let us check to see whether these functions approximately
satisfy the orthogonality condition. If we substitute (16.8) into
the integral (16.5), then for F = 1 the integrand contains a
rapidly oscillating factor $e^{i(S - S')/\hbar}$, where S' is obtained
from S by substituting Q' for Q. These oscillations cease only if Q'
is close to Q. This condition is essential for the integral to be
noticeably nonzero. For this reason we can replace S - S' in the
exponent by

$$S - S' = -\sum_{r=1}^{n} (Q_r' - Q_r)\frac{\partial S}{\partial Q_r} \tag{16.11}$$

or

$$S - S' = \sum_{r=1}^{n} (Q_r' - Q_r) P_r \tag{16.12}$$

where $P_r$ is defined in (16.3). We can write (16.12) in a short
form as

$$S - S' = (Q' - Q) P \tag{16.13}$$

In all the factors of the exponential function we can set Q' = Q.
We arrive at

$$\int \bar{\Psi}_{Q'}(q)\,\Psi_Q(q)\,dq = c^2 \int e^{\frac{i}{\hbar}(Q' - Q)P} \left|\frac{\partial^2 S}{\partial q\,\partial Q}\right| dq \tag{16.14}$$
But if $P_r$ is defined by (16.3), the determinant in the integrand is
the Jacobian of the transformation from variable P to variable q,
so that

$$\left|\frac{\partial^2 S}{\partial q\,\partial Q}\right| dq = dP_1 \ldots dP_n = dP \tag{16.15}$$

This means that (16.14) can be written in the following form:

$$\int \bar{\Psi}_{Q'}(q)\,\Psi_Q(q)\,dq = c^2 \int e^{\frac{i}{\hbar}(Q' - Q)P}\,dP \tag{16.16}$$

But the right-hand side of (16.16) is simply the product of delta
functions (16.7). This finally brings us to

$$\int \bar{\Psi}_{Q'}(q)\,\Psi_Q(q)\,dq = \delta_0(Q - Q') \tag{16.17}$$

which means that the orthonormality condition is satisfied.

Now let us consider the matrix of an operator F, the operator
in terms of $q_r$ and $p_r = -i\hbar(\partial/\partial q_r)$:

$$F = F(q, p) = F\left(q, -i\hbar\frac{\partial}{\partial q}\right) \tag{16.18}$$

When such an operator acts on the exponential function
$e^{iS/\hbar}$, the result in the considered approximation will be
equal to the product of the exponential function and
$F(q, \partial S/\partial q)$:

$$F\left(q, -i\hbar\frac{\partial}{\partial q}\right) e^{iS/\hbar} \approx e^{iS/\hbar}\,F\left(q, \frac{\partial S}{\partial q}\right) \tag{16.19}$$

This also holds for function (16.8). For this reason in (16.5) we
can consider F not as a differential operator but as the function
in the right-hand side of (16.19). Assuming, as before, that in
the factors of the exponential function Q' equals Q, we have

$$(Q'\,|\,F^*\,|\,Q) = c^2 \int F\left(q, \frac{\partial S}{\partial q}\right) e^{\frac{i}{\hbar}(S - S')} \left|\frac{\partial^2 S}{\partial q\,\partial Q}\right| dq \tag{16.20}$$

We take P for the variables of integration, as in (16.16).
Function F will then be transformed thus:

$$F(q, p) = F(q(Q, P),\ p(Q, P)) = F^*(Q, P) \tag{16.21}$$

where p and P are understood to be the classical expressions
(16.3). Because of the approximate nature of formula (16.13) we
can write

$$(Q'\,|\,F^*\,|\,Q) = c^2 \int F^*(Q, P)\,e^{\frac{i}{\hbar}(Q' - Q)P}\,dP \tag{16.22}$$
To evaluate the integral we first note that multiplying the
exponential function in the integrand by P is equivalent to applying
the operator $-i\hbar(\partial/\partial Q')$ to this function. Hence

$$\int F^*(Q, P)\,e^{\frac{i}{\hbar}(Q' - Q)P}\,dP = \int F^*\left(Q, -i\hbar\frac{\partial}{\partial Q'}\right) e^{\frac{i}{\hbar}(Q' - Q)P}\,dP \tag{16.23}$$

We can then take F* outside the integral sign and use (16.16)
and (16.17) to get

$$(Q'\,|\,F^*\,|\,Q) = F^*\left(Q, -i\hbar\frac{\partial}{\partial Q'}\right) \delta_0(Q - Q') \tag{16.24}$$

Here, as in previous formulas, for the first independent variable
of F* we could have taken Q'. Since the result of applying F* to
a function $\psi(Q)$ is given by the formula

$$F^*\psi(Q) = \int (Q\,|\,F^*\,|\,Q')\,\psi(Q')\,dQ' \tag{16.25}$$

we will, by using the expression for the matrix element
[Eq. (16.24)], obtain

$$F^*\psi(Q) = F^*\left(Q, -i\hbar\frac{\partial}{\partial Q}\right)\psi(Q) \tag{16.26}$$

This will be the transformed version of F (to within terms that
depend on the sequence of factors in F).

We can summarize thus. By using the approximate relation
(16.19) we were able to pass from the operator
$F(q, -i\hbar\,\partial/\partial q)$ to the function F(q, p), which was
then expressed in terms of the new variables Q and P via the classical
formulas for contact transformation. From the new function F*(Q, P)
we returned (when using differentiation with respect to a parameter
for calculating the integral) to the operator
$F^*(Q, -i\hbar\,\partial/\partial Q)$.

We have arrived at the following result. Consider the operator

$$F = F(q, p) \quad \text{where} \quad p = -i\hbar\frac{\partial}{\partial q} \tag{16.27}$$

which means that F is expressed in terms of a set of variables q.
After a canonical transformation to a new set of variables Q
the operator F changes to F*. By analogy with (16.27) let F*
have the form

$$F^* = F^*(Q, P) \quad \text{where} \quad P = -i\hbar\frac{\partial}{\partial Q} \tag{16.28}$$

Now let us assume that the eigenfunctions used to effect the
canonical transformation from q to Q in the semiclassical
approximation are of form (16.8), so that their phase is
$S(q, Q)/\hbar$. In that case F* can be obtained from F, to within
terms that depend on the sequence of factors,⁶ by a simple algebraic
transformation via

$$F(q, p) = F^*(Q, P) \tag{16.29}$$

$$p = \frac{\partial S}{\partial q}, \qquad P = -\frac{\partial S}{\partial Q} \tag{16.30}$$

with S the function that enters the phase of the unitary
transformation. In classical mechanics these formulas represent the
contact transformation.

⁶ The difference of the terms that depend on the sequence of factors will
tend to zero as $\hbar \to 0$.



Chapter IV 


THE PROBABILISTIC INTERPRETATION 
OF QUANTUM MECHANICS 


1. Mathematical expectation in the probability theory 

We first recall the concept of mathematical expectation known
from the probability theory. Let a quantity $\lambda$ take on the values

$$\lambda_1, \lambda_2, \ldots, \lambda_k, \ldots \tag{1.1}$$

whose probabilities are respectively

$$p_1, p_2, \ldots, p_k, \ldots \tag{1.2}$$

and the sum of the probabilities is unity:

$$p_1 + p_2 + \ldots + p_k + \ldots = 1 \tag{1.3}$$

The mathematical expectation of a quantity is the sum of the
products of the values of the quantity multiplied by the
probabilities of their occurrence:

$$\text{M.E.}\,\lambda = \sum_k p_k \lambda_k \tag{1.4}$$

where M. E. stands for “mathematical expectation”.

Let us use a simple example to illustrate. Suppose we have N
lottery tickets. Of these $n_1$ win $\lambda_1$ rubles, $n_2$ win
$\lambda_2$ rubles, etc. If this is a lottery in which some holders
lose and some win, one of the $\lambda$'s can be zero. Obviously

$$n_1 + n_2 + \ldots = N \tag{1.5}$$

and if we denote the sum of the winnings by A,

$$n_1 \lambda_1 + n_2 \lambda_2 + \ldots = A \tag{1.6}$$

The average winning per ticket (if we include nonwinnings) is

$$l = \frac{A}{N} \tag{1.7}$$

and the probability of winning $\lambda_k$ rubles is

$$p_k = \frac{n_k}{N} \tag{1.8}$$
If we substitute A defined by (1.6) into (1.7) and use (1.8), we
can write the expression for the average winning per ticket as

$$l = \sum_k p_k \lambda_k \tag{1.9}$$

By comparing this with the general formula (1.4) we see that
in our example the mathematical expectation is the average
winning per ticket:

$$\text{M.E.}\,\lambda = l \tag{1.10}$$

In general, the mathematical expectation of a quantity is its mean
value.
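The lottery example can be worked through with concrete numbers. The sketch below uses a hypothetical lottery (the prizes and ticket counts are invented for illustration) and exact rational arithmetic, so that the equality between the average winning (1.7) and the mathematical expectation (1.4) holds exactly.

```python
from fractions import Fraction

# Hypothetical lottery (an assumption for illustration):
# prize in rubles -> number of tickets that win it.
tickets = {0: 90, 10: 8, 100: 2}

N = sum(tickets.values())                                    # Eq. (1.5)
A = sum(prize * count for prize, count in tickets.items())   # Eq. (1.6)
average = Fraction(A, N)                                     # Eq. (1.7)

# Eq. (1.8): probability of each prize.
probabilities = {prize: Fraction(count, N) for prize, count in tickets.items()}

# Eq. (1.4): mathematical expectation as a probability-weighted sum.
expectation = sum(p * prize for prize, p in probabilities.items())
```

Here both `average` and `expectation` come out as 14/5 rubles per ticket, as Eq. (1.10) requires.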

The theory also deals with quantities that vary continuously.
Let $\lambda$ take on, in addition to a discrete set of values, a
continuous set in a certain interval. The probability of a value of
$\lambda$ lying between $\lambda$ and $\lambda + d\lambda$ will,
generally speaking, be proportional to $d\lambda$; we set it equal to

$$p(\lambda)\,d\lambda \tag{1.11}$$

The sum of probabilities must, as before, be equal to unity; this
condition can be written as

$$\sum_k p_k + \int p(\lambda)\,d\lambda = 1 \tag{1.12}$$

where integration is considered over the whole continuous set of
values. Finally, the mathematical expectation has the form

$$\text{M.E.}\,\lambda = \sum_k p_k \lambda_k + \int \lambda\,p(\lambda)\,d\lambda \tag{1.13}$$

2. Mathematical expectation in quantum mechanics 

Let us now turn to the quantum theory. We have already
discovered that an electronic state can be described by a wave
function $\psi$. We understood the description to mean that if $\psi$
is an eigenfunction of the operator L corresponding to a physical
quantity $\lambda$, and $\lambda'$ is the respective eigenvalue, then
specifying $\psi$ is equivalent to indicating that in measuring
$\lambda$ we will get $\lambda = \lambda'$. An eigenvalue is expressed
in terms of the corresponding eigenfunction as

$$\lambda' = \frac{\int \bar{\psi} L \psi\,d\tau}{\int \bar{\psi} \psi\,d\tau} \tag{2.1}$$

So how is one to understand the description of a state by a
function that is not, in general, an eigenfunction of a certain
operator L? We will answer this question by introducing a
hypothesis about the probabilistic nature of such a description.

Let us assume that we are dealing with a system of electrons
all of which are in the same state $\psi$. If for each electron we
measure the quantity $\lambda$, then in keeping with our hypothesis
separate measurements may give different results (because of the
influence of the measuring process on the object). The mean value
obtained from these measurements, however, will be a definite number,
which will represent the mathematical expectation of $\lambda$ in
state $\psi$. Hence our hypothesis leads to the assumption that the
result of a separate measurement can be accidental but that for a
large number of measurements the mean value does not depend on this
number if it is great. Since, practically speaking, large numbers
of electrons are involved in most cases, the mean value, or the
mathematical expectation, of a quantity is even more accessible
to measurement than the value of this quantity for a separate
electron.
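The statement that the mean over many measurements settles to a definite number can be illustrated by a small simulation. This is only a sketch of the statistical idea: the two possible outcomes and their probabilities are hypothetical, and a pseudo-random generator with a fixed seed stands in for the individual measurements.

```python
import random

# Hypothetical two-valued quantity (assumption for illustration).
values = [-1.0, 1.0]
probabilities = [0.25, 0.75]

# Eq. (1.4): the mathematical expectation of the quantity.
expectation = sum(p * v for p, v in zip(probabilities, values))


def sample_mean(n, rng):
    """Mean of n independent simulated measurements."""
    draws = rng.choices(values, weights=probabilities, k=n)
    return sum(draws) / n


rng = random.Random(0)  # fixed seed so that the run is reproducible
means = {n: sample_mean(n, rng) for n in (10, 1000, 100000)}
```

Each single draw is accidental, but the entries of `means` approach `expectation` as the number of simulated measurements grows.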

We have thus given a physical interpretation of the concept of
mathematical expectation. We will express it in terms of function
$\psi$, which characterizes the electronic state, and operator L,
which characterizes the sought-for quantity.

To start with, mathematical expectation must be an invariant,
that is, it must not depend on the choice of independent variables
in the wave function or on the representation of operators. In
short, it must be invariant under unitary transformations,
considered in the previous chapter.

Aside from this, mathematical expectation must possess two
properties known from the theory of probability. First, the
mathematical expectation of the sum of two quantities must be equal
to the sum of the mathematical expectations of these quantities,
irrespective of whether the quantities are interdependent or not.
Second, if in a given state the quantity $\lambda$ has a definite
value $\lambda'$, the mathematical expectation must be $\lambda'$.

These conditions uniquely determine the expression for
mathematical expectation. Invariance implies that mathematical
expectation must be expressed in terms of unitary invariants.
These are the eigenvalues of operators and expressions of the
type

$$\int \bar{\psi} L \psi\,d\tau, \qquad \int \bar{\psi} L^2 \psi\,d\tau, \qquad \text{etc.} \tag{*}$$

But we cannot interpret the eigenvalues of operators as
mathematical expectations, if only because an eigenvalue of the sum
of two operators is not, generally speaking, equal to the sum of two
eigenvalues of the operators. We are thus left with expressions of
type (*). From these we must choose the first one or a
quantity proportional to it, since the first of the properties
mentioned before requires that the mathematical expectation be linear
in L. The second property gives the proportionality factor. Thus⁷

$$\text{M.E.}\,L = \frac{\int \bar{\psi} L \psi\,d\tau}{\int \bar{\psi} \psi\,d\tau} \tag{2.2}$$

or, if $\psi$ is normalized,

$$\text{M.E.}\,L = \int \bar{\psi} L \psi\,d\tau, \qquad \int \bar{\psi} \psi\,d\tau = 1 \tag{2.3}$$

These expressions satisfy all the stated requirements since, as we
know, they are invariant under unitary transformations (the
completeness property of eigenfunctions) and also

$$\int \bar{\psi}(L + M)\psi\,d\tau = \int \bar{\psi} L \psi\,d\tau + \int \bar{\psi} M \psi\,d\tau \tag{2.4}$$

so that

$$\text{M.E.}\,(L + M) = \text{M.E.}\,L + \text{M.E.}\,M \tag{2.5}$$

Finally, if in state $\psi$ a quantity with an operator L takes on a
value $\lambda$, that is if

$$L\psi = \lambda\psi \tag{2.6}$$

then

$$\text{M.E.}\,L = \lambda \tag{2.7}$$

Hence this purely mathematical description has led us to a
certain expression for the mathematical expectation of a quantity
that is defined by an operator L and that characterizes the system
in a state $\psi$. The example of the scattering of $\alpha$-particles
(the Rutherford scattering law), to be considered at the end of
Chapter V, Part II, shows that our theory agrees with experiment.
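The two requirements used above can be seen at work in the smallest nontrivial case. The following sketch assumes a two-component state and two 2×2 hermitian matrices (the Pauli matrices $\sigma_z$ and $\sigma_x$, chosen here only as a convenient example; they are not introduced at this point in the text). It shows that expression (2.2) is additive while eigenvalues are not.

```python
import math


def expectation(op, psi):
    """Eq. (2.2): <psi|op|psi> / <psi|psi> for a 2x2 matrix and a 2-component state."""
    op_psi = [
        op[0][0] * psi[0] + op[0][1] * psi[1],
        op[1][0] * psi[0] + op[1][1] * psi[1],
    ]
    numerator = sum(a.conjugate() * b for a, b in zip(psi, op_psi))
    norm = sum(abs(a) ** 2 for a in psi)
    return (numerator / norm).real


def eigenvalues(op):
    """Eigenvalues of a 2x2 hermitian matrix [[a, b], [conj(b), d]]."""
    a, d = op[0][0].real, op[1][1].real
    half_gap = math.hypot((a - d) / 2.0, abs(op[0][1]))
    return ((a + d) / 2.0 - half_gap, (a + d) / 2.0 + half_gap)


# Example operators (assumption): L = sigma_z, M = sigma_x.
L = [[1 + 0j, 0j], [0j, -1 + 0j]]
M = [[0j, 1 + 0j], [1 + 0j, 0j]]
L_plus_M = [[L[i][j] + M[i][j] for j in range(2)] for i in range(2)]

psi = [0.6 + 0j, 0.8j]  # an arbitrary normalized state (hypothetical)

additive_gap = expectation(L_plus_M, psi) - (expectation(L, psi) + expectation(M, psi))
largest_of_sum = eigenvalues(L_plus_M)[1]                 # sqrt(2)
sum_of_largest = eigenvalues(L)[1] + eigenvalues(M)[1]    # 1 + 1
```

`additive_gap` vanishes, in line with (2.5), whereas the largest eigenvalue of L + M is $\sqrt{2}$ and not 2, which is why eigenvalues themselves cannot serve as mathematical expectations. For the eigenstate `[1, 0]` of L the function `expectation` returns the eigenvalue 1, in line with (2.7).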

3. The probability formula 

Equation (2.2) for mathematical expectation gives us a simple
formula for the probability that in measuring a given quantity
we will obtain a definite value or a value lying within certain
limits.

Let the $\psi(x, \lambda)$ be the set of eigenfunctions of L. We
expand $\psi(x)$, which describes the electronic state, in a
$\psi(x, \lambda)$-series:

$$\psi(x) = \sum_k c(\lambda_k)\,\psi(x, \lambda_k) + \int c(\lambda)\,\psi(x, \lambda)\,d\lambda \tag{3.1}$$

The result of applying L to $\psi$ is

$$L\psi(x) = \sum_k \lambda_k\,c(\lambda_k)\,\psi(x, \lambda_k) + \int \lambda\,c(\lambda)\,\psi(x, \lambda)\,d\lambda \tag{3.2}$$

⁷ We will denote the quantity by the letter used for its operator.


Assuming that $\psi$ is normalized, we compose the expression for
the mathematical expectation of $\lambda$. From the completeness
condition we have

$$\text{M.E.}\,\lambda = \int \bar{\psi} L \psi\,d\tau = \sum_k |c(\lambda_k)|^2 \lambda_k + \int \lambda\,|c(\lambda)|^2\,d\lambda \tag{3.3}$$

If we compare this with (1.13), we find that the probability of
$\lambda$ being equal to $\lambda_k$ is

$$p_k = |c(\lambda_k)|^2 \tag{3.4}$$

and the probability of $\lambda$ lying within limits $\lambda$ and
$\lambda + d\lambda$ is

$$p(\lambda)\,d\lambda = |c(\lambda)|^2\,d\lambda \tag{3.5}$$

The sum of the probabilities must be unity because, due to $\psi$
being normalized and the completeness condition, we have

$$\int \bar{\psi} \psi\,d\tau = \sum_k |c(\lambda_k)|^2 + \int |c(\lambda)|^2\,d\lambda = 1 \tag{3.6}$$

The expansion coefficients $c(\lambda_k)$ and $c(\lambda)$ are the wave
functions describing electronic states in the $\lambda$-representation.
Thus Eqs. (3.4) and (3.5) give a direct physical interpretation of the
squared modulus of the wave function as the probability. For instance,
let us set $\lambda$ to be the coordinates x, y, z. According to (3.5),
the electron will be inside a volume with the boundaries

$$(x, x + dx), \quad (y, y + dy), \quad (z, z + dz) \tag{3.7}$$

with a probability

$$|\psi(x, y, z)|^2\,dx\,dy\,dz \tag{3.8}$$

Now consider the general situation with $\lambda$ a physical quantity.
When the initial state is given, to find the probability that the
result of measuring $\lambda$ will be a specific value we must express
in the $\lambda$ variables the wave function that defines this state
[to put it differently, we must find the coefficients $c(\lambda)$ in
the expansion of $\psi(x)$ in a $\psi(x, \lambda)$-series]. The square
of its modulus, that is $|c(\lambda)|^2$, gives the probability we are
looking for.
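The recipe of Eqs. (3.1), (3.4) and (3.3) can be followed step by step for a quantity with a purely discrete spectrum. The sketch below assumes a hypothetical two-level quantity whose operator is the matrix $\sigma_x$, with eigenvalues ±1 and eigenvectors $(1, \pm 1)/\sqrt{2}$; the state parameters `theta` and `phi` are arbitrary illustrative numbers.

```python
import cmath
import math

# Eigenvalues and eigenvectors of the example operator sigma_x (assumption).
s = 1.0 / math.sqrt(2.0)
eigenvalues = [1.0, -1.0]
eigenvectors = [[s, s], [s, -s]]


def coefficient(basis_vector, psi):
    """Expansion coefficient c = <basis_vector|psi>, the discrete form of (3.1)."""
    return sum(complex(a).conjugate() * b for a, b in zip(basis_vector, psi))


theta, phi = 0.7, 0.3  # arbitrary parameters of a normalized state
psi = [math.cos(theta), cmath.exp(1j * phi) * math.sin(theta)]

# Eq. (3.4): probabilities as squared moduli of the coefficients.
probabilities = [abs(coefficient(e, psi)) ** 2 for e in eigenvectors]

# Eq. (3.3): the expectation rebuilt from the probabilities ...
mean_from_probabilities = sum(p * lam for p, lam in zip(probabilities, eigenvalues))

# ... and evaluated directly as <psi|sigma_x|psi> = 2 Re(conj(psi_0) psi_1).
mean_direct = 2.0 * (complex(psi[0]).conjugate() * psi[1]).real
```

The probabilities add up to unity, as in (3.6), and the two ways of computing the mean agree.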

Let a state be characterized by a function $\varphi(x, \mu_k)$ that is
an eigenfunction of M corresponding to an eigenvalue $\mu_k$:

$$M\varphi(x, \mu_k) = \mu_k\,\varphi(x, \mu_k) \tag{3.9}$$

which means that in this state the measurement of $\mu$ gives a
definite value, $\mu_k$. What is the probability then that in measuring
another quantity, say $\lambda$, we will get a value $\lambda_n$? If we
apply Eq. (3.1) to $\psi(x) = \varphi(x, \mu_k)$ and recall the formula
for the expansion coefficients $c(\lambda)$, we find the sought
expression

$$|c(\lambda_n)|^2 = \left|\int \bar{\psi}(x, \lambda_n)\,\varphi(x, \mu_k)\,d\tau\right|^2 \tag{3.10}$$

On the other hand, if we were to look for the probability that
$\mu = \mu_k$ on condition that $\lambda = \lambda_n$, we would come to
the same result (3.10). Thus the probability that $\mu = \mu_k$ if
$\lambda = \lambda_n$ is equal to the probability that
$\lambda = \lambda_n$ if $\mu = \mu_k$.

If $\lambda$ and $\mu$ are the same quantity, then $\psi$ and $\varphi$
are the eigenfunctions of one operator. According to the orthogonality
of eigenfunctions, at $\lambda_k \neq \lambda_n$ the integral will
vanish, and at $\lambda_k = \lambda_n$ it will be unity. This agrees
with the physical meaning of (3.10) as probability. We see then that
two orthogonal wave functions describe incompatible states.

When the given quantity can change continuously, we cannot
speak of the probability of it having a definite value, since this
probability is zero. Instead we can speak of the probability that the
quantity lies in a definite interval and also of the “probability
density”, that is, the probability divided by the width of the
interval. For instance,

$$|\psi(x, y, z)|^2$$

is the probability density for the coordinates.

Here lies the difference between normalization of functions for
the discrete and for the continuous spectrum. A transition from
eigenfunctions to proper differentials corresponds to the transition
from probability density to the probability that the quantity will
lie within definite limits.

4. Time dependence of mathematical expectation 

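The mathematical expectation of a quantity with an operator L,

$$\text{M.E.}\,L = \frac{\int \bar{\psi} L \psi\,d\tau}{\int \bar{\psi} \psi\,d\tau} \tag{4.1}$$

will, generally speaking, depend on time. If we choose a
representation in which the position and momentum operators do not
depend on time explicitly, $\psi$ of (4.1) will satisfy the wave
equation

$$H\psi - i\hbar\frac{\partial \psi}{\partial t} = 0 \tag{4.2}$$

where H is the Hamiltonian.

Let us first show that the integral in the denominator of (4.1),
$\int \bar{\psi} \psi\,d\tau$, will not depend on time.⁸ We have

$$\frac{d}{dt}\int \bar{\psi} \psi\,d\tau = \int \frac{\partial \bar{\psi}}{\partial t}\,\psi\,d\tau + \int \bar{\psi}\,\frac{\partial \psi}{\partial t}\,d\tau$$

⁸ Also see (13.6) in Chapter III.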
Using the wave equation yields

$$\frac{d}{dt}\int \bar{\psi} \psi\,d\tau = \frac{i}{\hbar}\int \left(\overline{H\psi}\,\psi - \bar{\psi}\,H\psi\right) d\tau$$

But since H is a hermitian operator, the expression on the right
is zero. Hence

$$\frac{d}{dt}\int \bar{\psi} \psi\,d\tau = 0 \tag{4.3}$$

If we set

$$\int \bar{\psi} \psi\,d\tau = 1 \tag{4.4}$$

at the initial moment of time (t = 0), the normalization (4.4) will
remain valid for any t > 0. With this in mind we can replace
(4.1) by a simpler formula,

$$\text{M.E.}\,L = \int \bar{\psi} L \psi\,d\tau \tag{4.5}$$

which also holds for any t > 0.

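Next we find the time derivative of the mathematical expectation
of L. We interchange the derivative and integral signs and for
$\partial\bar{\psi}/\partial t$ substitute its expression from the wave
equation to get

$$\frac{d}{dt}\int \bar{\psi} L \psi\,d\tau = \frac{i}{\hbar}\int \overline{H\psi}\,L\psi\,d\tau + \int \bar{\psi}\,\frac{\partial}{\partial t}(L\psi)\,d\tau$$

which in turn yields

$$\frac{d}{dt}\int \bar{\psi} L \psi\,d\tau = \int \bar{\psi}\left(\frac{\partial}{\partial t} + \frac{i}{\hbar}H\right) L\psi\,d\tau \tag{4.6}$$

since H is hermitian. If we differentiate explicitly, we get

$$\frac{d}{dt}\int \bar{\psi} L \psi\,d\tau = \int \bar{\psi}\left[\frac{\partial L}{\partial t} + \frac{i}{\hbar}(HL - LH)\right]\psi\,d\tau \tag{4.7}$$

But the expression in brackets is just the operator for the time
derivative of L:

$$\frac{dL}{dt} = \frac{\partial L}{\partial t} + \frac{i}{\hbar}(HL - LH) \tag{4.8}$$

Thus Eq. (4.7) expresses the fact that

$$\frac{d}{dt}\int \bar{\psi} L \psi\,d\tau = \int \bar{\psi}\,\frac{dL}{dt}\,\psi\,d\tau \tag{4.9}$$

namely, that the time derivative of mathematical expectation
equals the mathematical expectation of the time derivative, which
is what it should be. If we take this as the starting point, we
come to the formula for the total time derivative of an operator,
a result we arrived at earlier in a different way.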

In Part II, which is devoted to the Schrödinger equation, we
will see that if for L we take operators corresponding to different 
mechanical quantities, the right-hand side of equations of the 
type (4.8) will look like the classical expression. Thus in quantum 
mechanics there exist the same relationships between mathematical 
expectations and their time derivatives as between quantities and 
their derivatives in classical mechanics. 
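Relation (4.9) can be verified numerically in a small model. The sketch below assumes a two-level system with a diagonal Hamiltonian (energies `E1`, `E2`), a state with fixed expansion coefficients, the operator L equal to the matrix $\sigma_x$ with no explicit time dependence, and units with $\hbar = 1$. All of these are illustrative choices, not taken from the text.

```python
import cmath

HBAR = 1.0
E1, E2 = 0.5, 2.0   # hypothetical energy levels
C1, C2 = 0.6, 0.8   # expansion coefficients with C1^2 + C2^2 = 1


def state(t):
    """Solution of the wave equation (4.2) for a diagonal Hamiltonian."""
    return [
        C1 * cmath.exp(-1j * E1 * t / HBAR),
        C2 * cmath.exp(-1j * E2 * t / HBAR),
    ]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def expectation(op, psi):
    """Eq. (4.5) for a normalized two-component state."""
    total = 0j
    for i in range(2):
        for j in range(2):
            total += psi[i].conjugate() * op[i][j] * psi[j]
    return total.real


H = [[E1, 0.0], [0.0, E2]]
L = [[0.0, 1.0], [1.0, 0.0]]
HL, LH = matmul(H, L), matmul(L, H)

# Eq. (4.8) with no explicit time dependence: dL/dt = (i/hbar)(HL - LH).
dL_dt = [[1j / HBAR * (HL[i][j] - LH[i][j]) for j in range(2)] for i in range(2)]

t, h = 0.37, 1e-5
lhs = (expectation(L, state(t + h)) - expectation(L, state(t - h))) / (2.0 * h)
rhs = expectation(dL_dt, state(t))
```

`lhs` is the numerical time derivative of the expectation and `rhs` is the expectation of the operator (4.8); the two agree to within the finite-difference error.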

5. Correspondence between the theory of linear operators and the 
quantum theory 

Summing up this chapter and Part I, we can say that each
quantum mechanical concept has corresponding to it a concept
from the theory of linear operators, which implies that we can
build a sort of lexicon that will enable us to translate
mathematical terms into the language of physics. Roughly it will
look like this:


MATHEMATICS                                         PHYSICS

Linear operator L                                   Physical quantity $\lambda$

Eigenvalues $\lambda'$ (characteristic, or          Observable values of the physical
proper, values)                                     quantity

Eigenfunction (characteristic, or proper,           Mechanical system is in the state
function) for eigenvalue $\lambda'$                 with $\lambda = \lambda'$

Commutativity of operators                          Possibility to observe physical
                                                    quantities simultaneously

Squared modulus of eigenfunction,                   Probability density
$|\psi|^2$

Normalization $\int |\psi|^2\,d\tau = 1$            Sum of probabilities is unity

Transition to proper differentials                  Inequality $\lambda' < \lambda < \lambda' + \Delta\lambda$
for a continuous spectrum                           has a finite probability

Orthogonality $\int \bar{\varphi}\,\psi\,d\tau = 0$   States $\varphi$ and $\psi$ are incompatible

Completeness of the set of functions                Values $\lambda'$, $\lambda''$, etc. are the only
$\psi(x, \lambda')$                                 possible ones

Integral $\int \bar{\varphi} L \varphi\,d\tau$      Mathematical expectation of $\lambda$ in
                                                    state $\varphi$

Squared modulus of coefficient in the               Probability that $\lambda = \lambda'$ in
expansion of $\varphi(x)$ in a                      state $\varphi$
$\psi(x, \lambda')$-series

The possibility of such a comparison shows how closely the 
two theories are related and why the theory of linear operators is 
so essential to the quantum theory. 
6. The concept of statistical ensemble
in quantum mechanics

In the first years of development of quantum mechanics, in the
early attempts to find a statistical (probabilistic) interpretation,
physicists were still bound by the notion of the electron being a
classical mass point. Even when de Broglie’s idea on the wave
nature of matter emerged, waves of matter were at times
interpreted as something that carries the mass points. Later, when
Heisenberg’s relations appeared, they were interpreted as
inaccuracy relations, and not as uncertainty relations. For instance,
it was thought that the electron had definite position and velocity
but that there was no possibility of determining either. The square
of the modulus of the wave function was interpreted as the
probability density for a particle, irrespective of the conditions of
the actual experiment, to have given coordinates (the coordinates
were thought of as definite). A similar interpretation was given
to the square of the modulus of the wave function in momentum
space. Both probabilities (in ordinary space and in momentum
space) were considered simultaneously as the probability of a
certain compound event, specifically that the particle has definite
values of coordinates and momentum. The actual impossibility,
expressed by Heisenberg’s relations, of their simultaneous
measurement therefore appeared as a paradox or caprice of nature,
according to which not everything existing is cognizable.

All these difficulties vanish if we fully admit the dual wave- 
corpuscular nature of the electron, establish its essence, and grasp 
what the quantum mechanical probabilities refer to and what 
statistical ensemble they belong to. 

First, let us try to give a general definition of a statistical 
ensemble. We assume an unlimited set of elements having various 
features, which make it possible to sort these elements and to 
observe the frequency of occurrence of an element with a given 
feature. If for this there exists a definite probability (that is, for 
each element of the set), the set constitutes a statistical ensemble. 

In quantum mechanics, as in classical physics, the only sets 
that can be considered are those whose elements have definite 
values of the parameters (features) according to which sorting 
can be done. This implies that the elements of a statistical 
ensemble must be described in a classical language, and that a 
quantum object cannot be an element of a statistical ensemble 
even if a wave function can be ascribed to the object. 

The elements of statistical ensembles considered in quantum
mechanics are not the micro-objects themselves but the results
of experiments with them, a definite experimental arrangement
corresponding to a definite ensemble. These results are described
classically and thus can serve as a basis for sorting the elements
of the ensemble. Since for different quantities the probability
distributions arising from a given wave function correspond to
different experimental arrangements, they belong to different
ensembles. But the wave function cannot belong to any definite
statistical ensemble. This can be illustrated by the following
diagram:

             E       p       x      ...
    ψ1      [ ]     [ ]     [ ]
    ψ2      [ ]     [ ]     [ ]
    ψ3      [ ]     [ ]     [ ]
    ...

To each square in this diagram belongs a definite statistical
ensemble with its own probability distribution for the result of
measurement of a given quantity. A row includes the ensembles
obtained by measuring various quantities (E, p, x, ...), starting
from one and the same initial state, whereas a column shows the
ensembles obtained by measuring one definite quantity, starting
from various states (ψ1, ψ2, ψ3, ...).

The deeper reason for the wave function not corresponding to 
any statistical ensemble is that the concept of the wave function 
belongs to the potentially possible (that is, to experiments not yet 
performed, whose outcome and even type are not known). The 
concept of the statistical ensemble, on the other hand, belongs to 
the accomplished (to the results of experiments of a definite type 
already carried out). 

The probability of this or that behaviour of an object with a
given initial state is determined by the internal properties of the
object and by the nature of the external conditions; it is a number
characterizing the potential possibilities of this or that behaviour
of the object. And the probability manifests itself in the frequency
of occurrence of a given behaviour of the object; the relative
frequency is its numerical measure. The probability thus belongs, in
essence, to the individual object (and not to an ensemble of
objects) and characterizes its potential possibilities. At the same
time, to determine its numerical value from experiment one must
have the statistics of the realization of these possibilities, so that
the experiment must be repeated many times. It is clear from this
that the probabilistic character of the quantum theory does not
exclude the fact that the theory is based on the properties of an
individual object.

To summarize we can say that the purpose of the main concept
of quantum mechanics, the concept of a state described by a wave
function, is to analyze objectively all potential possibilities
inherent in the micro-object. This determines the probabilistic nature
of the theory.




Part II 


SCHRÖDINGER’S THEORY


Chapter I


THE SCHRÖDINGER EQUATION.

THE HARMONIC OSCILLATOR 

1. Equations of motion and the wave equation 

As we know, the wave equation, which gives the time
dependence of wave functions, must have the following form:

$$H\psi - i\hbar\frac{\partial \psi}{\partial t} = 0 \tag{1.1}$$

with H the Hamiltonian. Generally speaking, different problems
have different Hamiltonians. Schrödinger’s theory considers the
case when the electron’s momentum is small compared to mc,
where c is the speed of light. This means that in the absence of a
magnetic field the corrections imposed by the theory of relativity
can be neglected, so that the electron is moving in an electric field
or in a potential U(x, y, z). A generalization of the Schrödinger
equation incorporating a magnetic field will be given in Part III.

In Section 8, Chapter III, Part I, by analogy with classical
mechanics we wrote the following expression for the Hamiltonian
operator:

$$H = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) + U(x, y, z) \tag{1.2}$$

where the first term on the right-hand side stands for the
kinetic-energy operator, and the second for the potential-energy
operator. If for the operators $p_x$, $p_y$, $p_z$ we substitute their
expressions, we come to the wave equation in the form

$$-\frac{\hbar^2}{2m}\nabla^2\psi + U(x, y, z)\,\psi - i\hbar\frac{\partial \psi}{\partial t} = 0 \tag{1.3}$$

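Let us examine the equations of motion that follow from the
Schrödinger equation. We find the velocity and acceleration
operators. According to Eq. (13.22), Chapter III, Part I,

$$\frac{dx}{dt} = \frac{i}{\hbar}(Hx - xH)$$

In the expression for H all terms except $p_x^2/(2m)$ commute with x,
and therefore

$$\frac{dx}{dt} = \frac{i}{2m\hbar}\left(p_x^2 x - x p_x^2\right) = \frac{i}{2m\hbar}\left[p_x(p_x x - x p_x) + (p_x x - x p_x) p_x\right]$$

But we know that

$$[p_x, x] = \frac{i}{\hbar}(p_x x - x p_x) = 1$$

Hence

$$\frac{dx}{dt} = \frac{1}{m} p_x \tag{1.4}$$

and by analogy

$$\frac{dy}{dt} = \frac{1}{m} p_y, \qquad \frac{dz}{dt} = \frac{1}{m} p_z \tag{1.4*}$$

Thus the velocity operator is the momentum operator divided
by m, as we expected. Next we find the operator for $dp_x/dt$. We
have

$$\frac{dp_x}{dt} = \frac{i}{\hbar}(H p_x - p_x H)$$

The only term in H that does not commute with $p_x$ is U(x, y, z).
We then have

$$\frac{dp_x}{dt} = \frac{i}{\hbar}(U p_x - p_x U) = -\frac{\partial U}{\partial x}$$

By reasoning in the same way about the other two components, in
summary we get

$$\frac{dp_x}{dt} = -\frac{\partial U}{\partial x}, \qquad \frac{dp_y}{dt} = -\frac{\partial U}{\partial y}, \qquad \frac{dp_z}{dt} = -\frac{\partial U}{\partial z} \tag{1.5}$$

Equations (1.4) and (1.5) coincide in form with the equations
of classical mechanics. If we now recall the interdependence of
the equations of motion and the law governing mathematical
expectations, we get

$$\frac{d}{dt}\int \bar{\psi}\,x\,\psi\,d\tau = \frac{1}{m}\int \bar{\psi}\,p_x\,\psi\,d\tau, \qquad \frac{d}{dt}\int \bar{\psi}\,p_x\,\psi\,d\tau = -\int \bar{\psi}\,\frac{\partial U}{\partial x}\,\psi\,d\tau$$

and two similar equations for y and z. These equations are called
Ehrenfest’s equations.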
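The operator identity behind (1.4), namely that $(i/\hbar)(Hx - xH)$ acts as $p_x/m$, can be checked on a concrete function. The sketch below works in one dimension and assumes units with $\hbar = m = 1$, a hypothetical potential `U`, and a Gaussian test function `f`; the second derivative in H is approximated by central differences, while $p_x f = -i\hbar f'$ is evaluated analytically for comparison.

```python
import math

HBAR, MASS = 1.0, 1.0  # illustrative units (assumption)


def U(x):
    """Hypothetical potential; it drops out of the commutator with x."""
    return 0.5 * x * x


def f(x):
    """Hypothetical smooth test function."""
    return math.exp(-(x - 0.3) ** 2)


def df(x):
    """Analytic derivative of f."""
    return -2.0 * (x - 0.3) * f(x)


def apply_H(g, x, h=1e-3):
    """(H g)(x) for H = p^2/2m + U, second derivative by central differences."""
    second = (g(x + h) - 2.0 * g(x) + g(x - h)) / h ** 2
    return -HBAR ** 2 / (2.0 * MASS) * second + U(x) * g(x)


def velocity_on_f(x):
    """((i/hbar)(H x - x H) f)(x), the left-hand side of (1.4) applied to f."""
    def xf(s):
        return s * f(s)
    return 1j / HBAR * (apply_H(xf, x) - x * apply_H(f, x))


def momentum_over_m_on_f(x):
    """((1/m) p_x f)(x) = -(i hbar / m) f'(x), the right-hand side of (1.4)."""
    return -1j * HBAR / MASS * df(x)


difference = abs(velocity_on_f(0.9) - momentum_over_m_on_f(0.9))
```

The two sides agree up to the discretization error of the finite differences, whatever potential is substituted for `U`, since only the kinetic term fails to commute with x.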


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2. Constants of the motion 

Now let us introduce the concept of the constants of the motion 
into quantum mechanics. It is customary in classical mechanics 
to call a constant of the motion a quantity (a function of coor¬ 
dinates and momenta) that remains constant under any initial 
conditions of the problem. In quantum mechanics we can define a 
constant of the motion as a physical quantity whose mathematical 
expectation remains constant, by virtue of the wave equation, 
under any initial conditions. 

For an operator L to be a constant of the motion it is necessary 
and sufficient that, according to Eqs. (4.7) and (4.8), Chapter IV, 
Part I, the following condition hold: 

$$ \frac{\partial L}{\partial t} + \frac{i}{\hbar}(HL - LH) = 0 \tag{2.1} $$

We can show that if L satisfies (2.1), its eigenfunctions, which 
are the solutions of the eigenvalue equation 

$$ L\psi = \lambda\psi \tag{2.2} $$

can be chosen so as at the same time to satisfy the wave equation 

$$ H\psi - i\hbar\frac{\partial\psi}{\partial t} = 0 \tag{2.3} $$

This will also be the case when the Hamiltonian contains the time 
variable explicitly. It follows that if at the initial moment (t = 0) 
the physical quantity L had a definite value $\lambda$, it will have this 
value at any subsequent moment. 

If operator L does not contain time explicitly, condition (2.1) 
reduces to the operator’s commutativity with the Hamiltonian. 

Let us assume, for instance, that L = H and that H does not 
contain time explicitly. As we know (Section 13, Chapter III, 
Part I), in this case the energy conservation law holds, that is, 
a state with a given energy E remains the same for any time t. 
Equation (2.2) will then be 

$$ H\psi = E\psi \tag{2.4} $$

The general solution to Eqs. (2.3) and (2.4) will have the 
following form: 

$$ \psi = \psi^0(x, y, z; E)\,e^{-iEt/\hbar} \tag{2.5} $$

We can use solutions of type (2.5) to build a solution satisfying 
the arbitrary initial conditions 

$$ \psi = f(x, y, z) \quad \text{at } t = 0 \tag{2.6} $$

For this we must expand the initial wave function (2.6) in a 
series of the eigenfunctions of the Hamiltonian: 

$$ f(x, y, z) = \sum_E c(E)\,\psi^0(x, y, z; E) \tag{2.7} $$

and then multiply each term by the corresponding exponential 
factor to get 

$$ \psi = \sum_E c(E)\,e^{-iEt/\hbar}\,\psi^0(x, y, z; E) \tag{2.8} $$

The series (2.8) will evidently be the general solution of the 
wave equation satisfying the initial conditions (2.6).¹ 

Knowledge of the constants of the motion makes it easier to 
find the solution to the wave equation. Assume that H does not 
contain time explicitly and that there are two operators, L and M, 
that commute with H (which makes them constants of the motion) 
and, more than that, commute with each other. Then the equations 

$$ H\psi = E\psi, \qquad L\psi = \lambda\psi, \qquad M\psi = \mu\psi \tag{2.9} $$

will have common eigenfunctions. To find these we can start by 
solving the simpler of the three equations and then adjust the 
solution so that it satisfies the other two. We will use this method 
repeatedly in the future. 

3. The Schrödinger equation 
for the harmonic oscillator 

Let us consider the three-dimensional harmonic oscillator with 
the Hamiltonian 

$$ H = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) + \frac{1}{2}m\left(\omega_1^2x^2 + \omega_2^2y^2 + \omega_3^2z^2\right) \tag{3.1} $$

A model of this kind can represent a molecule with three vi¬ 
brational degrees of freedom. To find the eigenfunctions of the 
Hamiltonian we will use the method mentioned in the previous 
section and look for operators that will commute with each other 
and the Hamiltonian. These will obviously be 

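$$ H^{(x)} = \frac{1}{2m}p_x^2 + \frac{1}{2}m\omega_1^2x^2, \qquad H^{(y)} = \frac{1}{2m}p_y^2 + \frac{1}{2}m\omega_2^2y^2, \qquad H^{(z)} = \frac{1}{2m}p_z^2 + \frac{1}{2}m\omega_3^2z^2 \tag{3.2} $$


¹ See (14.5) and (14.6), Chapter III, Part I. 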
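To the equation 

$$ H\psi = E\psi \tag{3.3} $$

we can join the three equations 

$$ H^{(x)}\psi = E^{(x)}\psi, \qquad H^{(y)}\psi = E^{(y)}\psi, \qquad H^{(z)}\psi = E^{(z)}\psi \tag{3.4} $$

where 

$$ H = H^{(x)} + H^{(y)} + H^{(z)} \tag{3.5} $$

and 

$$ E = E^{(x)} + E^{(y)} + E^{(z)} \tag{3.6} $$

Since all three equations in (3.4) are of the same type, it is 
sufficient to study only one, for instance, the first, which amounts 
to considering the one-dimensional oscillator. Thus for the 
x-component we have 

$$ \left(\frac{1}{2m}p_x^2 + \frac{1}{2}m\omega_1^2x^2\right)\psi = E^{(x)}\psi \tag{3.7} $$

or 

$$ -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + \frac{1}{2}m\omega_1^2x^2\psi = E^{(x)}\psi \tag{3.7*} $$

It is convenient to make a change of variables, 

$$ \xi = x\sqrt{\frac{m\omega_1}{\hbar}}, \qquad E^{(x)} = \lambda\hbar\omega_1 \tag{3.8} $$

Equation (3.7) then reads 

$$ -\frac{d^2\psi}{d\xi^2} + \xi^2\psi = 2\lambda\psi \tag{3.9} $$

Finally we denote the solution of this equation by $\psi_\lambda(\xi)$; the 
solution to Eq. (3.3) for the three-dimensional oscillator will 
then be 

$$ \psi_E(x, y, z) = \psi_{\lambda_1}(\xi_1)\,\psi_{\lambda_2}(\xi_2)\,\psi_{\lambda_3}(\xi_3) \tag{3.10} $$

where $\xi_1 = x\sqrt{m\omega_1/\hbar}$, $\xi_2 = y\sqrt{m\omega_2/\hbar}$, $\xi_3 = z\sqrt{m\omega_3/\hbar}$, and, according 
to (3.6), 

$$ E = \hbar(\omega_1\lambda_1 + \omega_2\lambda_2 + \omega_3\lambda_3) \tag{3.11} $$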

Thus the problem reduces itself to an investigation of Eq. (3.9), 
which we will now undertake. 


4. The one-dimensional harmonic oscillator 
We start with the equation 

$$ \frac{d^2\psi}{d\xi^2} + (2\lambda - \xi^2)\psi = 0 \tag{4.1} $$
Since the coefficients in (4.1) remain finite for finite $\xi$, the 
only singular points of this equation are $\xi = \pm\infty$. We must find 
solutions that remain finite at $\xi = \pm\infty$. Such solutions, we will 
see, exist only for specific values of the parameter $\lambda$; these values 
are the eigenvalues of the corresponding operator. 

To examine the behaviour of Eq. (4.1) as $\xi \to +\infty$ we introduce 
a new function 

$$ f = \frac{1}{\psi}\frac{d\psi}{d\xi} \tag{4.2} $$

which turns (4.1) into 

$$ \frac{df}{d\xi} + f^2 = \xi^2 - 2\lambda \tag{4.3} $$

We look for f in the form of a series: 

$$ f = a\xi + b + \frac{c}{\xi} + \ldots \tag{4.4} $$

If we now substitute (4.4) into (4.3), we get 

$$ a^2\xi^2 + 2ab\xi + b^2 + 2ac + a + \ldots = \xi^2 - 2\lambda $$

Identifying the powers of $\xi$ yields 

$$ a^2 = 1, \qquad b = 0, \qquad (2c + 1)a = -2\lambda \tag{4.5} $$

The two values of a, namely $a = \pm 1$, give two possible 
solutions: 

$$ f = \xi - \frac{\lambda + \frac{1}{2}}{\xi} + \ldots $$

$$ f = -\xi + \frac{\lambda - \frac{1}{2}}{\xi} + \ldots \tag{4.4*} $$

For the first solution, after integrating both sides of (4.2), we 
get 

$$ \ln\psi = \frac{1}{2}\xi^2 - \left(\lambda + \frac{1}{2}\right)\ln\xi + \ldots $$

which implies that 

$$ \psi_1 = e^{\xi^2/2}\,\xi^{-\lambda-1/2}(1 + \ldots) \tag{4.6} $$

For the second solution 

$$ \psi_2 = e^{-\xi^2/2}\,\xi^{\lambda-1/2}(1 + \ldots) \tag{4.6*} $$

where we have left out the terms that decrease as $|\xi|$ increases. 
For large positive values of $\xi$ the general solution to (4.1) will be 

$$ \psi = c_1e^{\xi^2/2}\,\xi^{-\lambda-1/2}(1 + \ldots) + c_2e^{-\xi^2/2}\,\xi^{\lambda-1/2}(1 + \ldots) \tag{4.7} $$

and for large negative values 

$$ \psi = c_1'e^{\xi^2/2}\,|\xi|^{-\lambda-1/2}(1 + \ldots) + c_2'e^{-\xi^2/2}\,|\xi|^{\lambda-1/2}(1 + \ldots) \tag{4.7*} $$
What we need is that $\psi$ should remain finite as $\xi \to +\infty$ and 
$\xi \to -\infty$. This is possible only if 

$$ c_1 = 0, \qquad c_1' = 0 \tag{4.8} $$

simultaneously. 

Hence the solution that interests us must have the following 
form: 

$$ \psi = e^{-\xi^2/2}F(\xi) \tag{4.9} $$

where $F(\xi)$ at $\xi = \pm\infty$ must be of the order of $\xi^{\lambda-1/2}$. Let us find 
the differential equation for $F(\xi)$. Substituting (4.9) into (4.1) 
and dividing out the exponential factor, we get 

$$ \frac{d^2F}{d\xi^2} - 2\xi\frac{dF}{d\xi} + (2\lambda - 1)F = 0 \tag{4.10} $$

We look for the solution to (4.10) in the form of a power series 
in $\xi$. Since $\xi = 0$ is not a singular point, the series will include 
only non-negative powers of $\xi$, that is 

$$ F = \sum_{k=0}^{\infty}a_k\xi^k \tag{4.11} $$


which if substituted into (4.10) yields 

$$ \sum_{k=0}^{\infty}k(k-1)a_k\xi^{k-2} + \sum_{k=0}^{\infty}(-2k + 2\lambda - 1)a_k\xi^k = 0 $$

In the first sum the factor $k(k-1)$ vanishes if k = 0 or k = 1, 
and we can start the summation from k = 2. If we then 
change k to k + 2, the new k will run from 0 to $\infty$. The result 
is 

$$ \sum_{k=0}^{\infty}\left[(k+2)(k+1)a_{k+2} + (-2k + 2\lambda - 1)a_k\right]\xi^k = 0 $$

For a power series to vanish all its coefficients must be zero. 
This gives 

$$ a_{k+2} = \frac{2k - 2\lambda + 1}{(k+2)(k+1)}\,a_k \tag{4.12} $$

This formula gives us a rule for successive determination of 
the $a_k$, with the first two, $a_0$ and $a_1$, being arbitrary constants. The 
series for $F(\xi)$ will then be 


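$$ F(\xi) = a_0\left(1 + \frac{1-2\lambda}{1\cdot 2}\xi^2 + \frac{(1-2\lambda)(5-2\lambda)}{1\cdot 2\cdot 3\cdot 4}\xi^4 + \ldots\right) + a_1\left(\xi + \frac{3-2\lambda}{2\cdot 3}\xi^3 + \frac{(3-2\lambda)(7-2\lambda)}{2\cdot 3\cdot 4\cdot 5}\xi^5 + \ldots\right) \tag{4.13} $$

or 

$$ F(\xi) = a_0F_0(\xi) + a_1F_1(\xi) \tag{4.14} $$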
where $F_0$ and $F_1$ denote the corresponding series. As $|\xi| \to \infty$ the 
function $F(\xi)$ must be of the order of $|\xi|^{\lambda-1/2}$ for both positive 
and negative values of $\xi$. But 

$$ F(-\xi) = a_0F_0(\xi) - a_1F_1(\xi) \tag{4.15} $$

Hence $a_0F_0(\xi)$ and $a_1F_1(\xi)$ must each be of an order no higher 
than $\xi^{\lambda-1/2}$. Here two cases are possible: either $F_0$ and $F_1$ are 
infinite series or at least one of them terminates. In the first 
case, both series converge for all $\xi$, since the ratio of two 
successive terms 

$$ \frac{a_{k+2}\,\xi^{k+2}}{a_k\,\xi^k} = \frac{2k - 2\lambda + 1}{(k+2)(k+1)}\,\xi^2 \tag{4.16} $$

tends to zero as $k \to \infty$. But the same ratio shows that starting 
from a definite term (for which $k > \lambda - \frac{1}{2}$) all terms will have 
the same sign. Hence the expansion will contain terms of the 
same sign with arbitrarily large powers of $\xi$, and its sum will 
grow faster than any finite power of $\xi$, which contradicts our 
assumption (the order must not be greater than $\xi^{\lambda-1/2}$). It follows 
that at least one of the series, $F_0$ or $F_1$, must terminate. This 
means that for a given k, for instance, k = n, the expansion 
coefficient $a_{k+2}$ vanishes but $a_k$ is not zero. By the recursion relation 
(4.12) this will happen if 

$$ \lambda = n + \frac{1}{2}, \qquad n = 0, 1, 2, \ldots \tag{4.17} $$

If n is even, $F_0(\xi)$ becomes a polynomial. If we put $a_1 = 0$, we 
get a solution that satisfies our assumption. On the other hand, 
for n odd it is $F_1(\xi)$ that becomes a polynomial, and we must set 
$a_0 = 0$. In both cases the solution will be a polynomial of degree n. 
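
The termination argument can be checked directly from the recursion (4.12). The following is a minimal sketch in Python, not part of the original text; the function name `series_coefficients` and the use of exact fractions are illustrative assumptions.

```python
from fractions import Fraction


def series_coefficients(lam, a0, a1, kmax):
    """Return [a_0, ..., a_kmax] generated by the recursion (4.12).

    lam is the parameter lambda; a0 and a1 are the two arbitrary constants.
    Exact fractions are used so that a vanishing coefficient is exactly zero.
    """
    lam = Fraction(lam)
    a = [Fraction(a0), Fraction(a1)]
    for k in range(kmax - 1):
        # a_{k+2} = (2k - 2*lambda + 1) / ((k + 2)(k + 1)) * a_k
        a.append((2 * k - 2 * lam + 1) / ((k + 2) * (k + 1)) * a[k])
    return a


# lambda = n + 1/2 with n = 2: the even series stops after the xi^2 term.
terminating = series_coefficients(Fraction(5, 2), 1, 0, 8)

# lambda = 2 is not of the form n + 1/2: the even series never stops.
non_terminating = series_coefficients(2, 1, 0, 8)
```

With $\lambda = 5/2$ and $a_1 = 0$ the list `terminating` contains only the coefficients of a polynomial of degree 2, in agreement with (4.17), whereas for $\lambda = 2$ every even coefficient remains different from zero.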


5. Hermite polynomials 
Polynomials that satisfy the equation 

$$ \frac{d^2F}{d\xi^2} - 2\xi\frac{dF}{d\xi} + 2nF = 0 \tag{5.1} $$


with integral n are called the Hermite polynomials and are 
denoted $H_n(\xi)$. The function (4.13) gives for them a representation 
in the form of a series; namely, for n even 

$$ H_n(\xi) = a_0\left(1 - \frac{2n}{1\cdot 2}\xi^2 + \frac{2n(2n-4)}{1\cdot 2\cdot 3\cdot 4}\xi^4 - \ldots\right) \tag{5.2} $$

and for n odd 

$$ H_n(\xi) = a_1\left(\xi - \frac{2n-2}{2\cdot 3}\xi^3 + \frac{(2n-2)(2n-6)}{2\cdot 3\cdot 4\cdot 5}\xi^5 - \ldots\right) \tag{5.3} $$
The constants $a_0$ and $a_1$ are usually defined in such a way that 
the leading coefficient is $2^n$. This implies that 

$$ a_0 = (-1)^{n/2}\,\frac{n!}{(n/2)!} \qquad (n \text{ even}) \tag{5.4} $$

$$ a_1 = (-1)^{(n-1)/2}\,\frac{2\,n!}{\left(\frac{n-1}{2}\right)!} \qquad (n \text{ odd}) \tag{5.5} $$

If we rearrange the polynomials $H_n(\xi)$ according to decreasing 
powers of $\xi$, we get an expression valid for both n even and n 
odd: 

$$ H_n(\xi) = (2\xi)^n - \frac{n(n-1)}{1}(2\xi)^{n-2} + \frac{n(n-1)(n-2)(n-3)}{1\cdot 2}(2\xi)^{n-4} - \ldots \tag{5.6} $$


Let us show that Hermite polynomials can be represented as 

$$ H_n(\xi) = (-1)^n e^{\xi^2}\frac{d^n}{d\xi^n}e^{-\xi^2} \tag{5.7} $$

First, it is easy to see that the right-hand side is a polynomial 
whose leading term is $(2\xi)^n$. Indeed, this term originates from 
the nth power of the derivative of the exponent $-\xi^2$. Since Eq. (5.1) 
has only one solution in the form of a polynomial, what remains 
to be shown is that the right-hand side of (5.7) satisfies (5.1). For 
this we note that $y = e^{-\xi^2}$ is the solution to 

$$ y' + 2\xi y = 0 $$

Differentiating this equation n + 1 times yields 

$$ y^{(n+2)} + 2\xi y^{(n+1)} + (2n + 2)y^{(n)} = 0 $$

or 

$$ z'' + 2\xi z' + (2n + 2)z = 0 $$

where 

$$ z = y^{(n)} $$

If we then introduce a new function 

$$ w = e^{\xi^2}z = e^{\xi^2}\frac{d^n}{d\xi^n}e^{-\xi^2} $$

we find that w is the solution to the equation 

$$ w'' - 2\xi w' + 2nw = 0 $$

which coincides with Eq. (5.1) for F. Hence the representation 
(5.7) is valid. 

Next we differentiate this equation, namely 


$$ \frac{d^2H_n}{d\xi^2} - 2\xi\frac{dH_n}{d\xi} + 2nH_n = 0 \tag{5.8} $$

with respect to $\xi$ and get 

$$ \frac{d^2H_n'}{d\xi^2} - 2\xi\frac{dH_n'}{d\xi} + (2n - 2)H_n' = 0 $$

We have just proved that $H_n'(\xi)$ and $H_{n-1}(\xi)$ satisfy the same 
equation and differ only by a factor. Since the leading term in 
$H_n'(\xi)$ is $2n(2\xi)^{n-1}$, and in $H_{n-1}(\xi)$ it is $(2\xi)^{n-1}$, we have the 
following relation: 

$$ \frac{dH_n}{d\xi} = 2nH_{n-1} \tag{5.9} $$

On the other hand, if we differentiate (5.7), we have 

$$ \frac{dH_n}{d\xi} = 2\xi H_n - H_{n+1} \tag{5.10} $$

Comparing the two expressions for the derivative, we come to a 
recursion relation that connects three successive Hermite 
polynomials: 

$$ H_{n+1} - 2\xi H_n + 2nH_{n-1} = 0 \tag{5.11} $$
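
The recursion (5.11) gives a convenient way of generating the polynomials explicitly. The sketch below is an illustration only; the function names and the representation of a polynomial as a list of coefficients in increasing powers of $\xi$ are assumptions made here, not notation from the text. It starts from $H_0 = 1$ and $H_1 = 2\xi$, which follow from (5.7).

```python
def hermite_coeffs(n):
    """Coefficients [h_0, ..., h_n] of H_n(xi), lowest power first.

    Built from H_0 = 1, H_1 = 2*xi and the recursion (5.11):
    H_{k+1} = 2*xi*H_k - 2*k*H_{k-1}.
    """
    prev, cur = [1], [0, 2]
    if n == 0:
        return prev
    for k in range(1, n):
        nxt = [0] + [2 * c for c in cur]      # multiply H_k by 2*xi
        for i, c in enumerate(prev):
            nxt[i] -= 2 * k * c               # subtract 2*k*H_{k-1}
        prev, cur = cur, nxt
    return cur


def derivative(coeffs):
    """Coefficients of the derivative of a polynomial."""
    return [i * c for i, c in enumerate(coeffs)][1:]


h2, h3, h4 = hermite_coeffs(2), hermite_coeffs(3), hermite_coeffs(4)
```

The lists obtained in this way reproduce the series (5.6), and differentiating `h3` term by term gives six times `h2`, which is relation (5.9) for n = 3.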

The functions 

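$$ \psi_n(\xi) = c_ne^{-\xi^2/2}H_n(\xi) \tag{5.12} $$

are often referred to as the Hermite functions and are the 
eigenfunctions of the operator in the left-hand side of 

$$ -\frac{d^2\psi_n}{d\xi^2} + \xi^2\psi_n = (2n + 1)\psi_n \tag{5.13} $$

and thus are orthogonal: 

$$ \int_{-\infty}^{+\infty}\psi_n(\xi)\,\psi_{n'}(\xi)\,d\xi = 0 \quad \text{at } n \ne n' \tag{5.14} $$

For them to be normalized we must determine the constant $c_n$ 
from the condition 

$$ \int_{-\infty}^{+\infty}\psi_n^2(\xi)\,d\xi = 1 \tag{5.15} $$

or 

$$ \int_{-\infty}^{+\infty}e^{-\xi^2}H_n^2(\xi)\,d\xi = \frac{1}{c_n^2} \tag{5.16} $$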

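Let us evaluate the last integral. Substituting (5.7) for one of 
the factors $H_n(\xi)$, we have 

$$ \frac{1}{c_n^2} = (-1)^n\int_{-\infty}^{+\infty}H_n(\xi)\,\frac{d^n}{d\xi^n}e^{-\xi^2}\,d\xi $$

After we integrate n times by parts, we get 

$$ \frac{1}{c_n^2} = \int_{-\infty}^{+\infty}\frac{d^nH_n}{d\xi^n}\,e^{-\xi^2}\,d\xi $$

But $d^nH_n/d\xi^n = 2^nn!$, which implies that 

$$ \frac{1}{c_n^2} = 2^nn!\int_{-\infty}^{+\infty}e^{-\xi^2}\,d\xi = \pi^{1/2}\,2^nn! \tag{5.17} $$

Consequently the functions 

$$ \psi_n(\xi) = \frac{1}{\pi^{1/4}(2^nn!)^{1/2}}\,e^{-\xi^2/2}H_n(\xi) \tag{5.18} $$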


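are orthonormalized. If we then solve (5.18) for $H_n(\xi)$ and 
substitute the result into (5.9) and (5.11), we get 

$$ \frac{d\psi_n}{d\xi} + \xi\psi_n = (2n)^{1/2}\,\psi_{n-1} \tag{5.19} $$

$$ \xi\psi_n = \left(\frac{n}{2}\right)^{1/2}\psi_{n-1} + \left(\frac{n+1}{2}\right)^{1/2}\psi_{n+1} \tag{5.20} $$

Then we see that substituting (5.20) into (5.19) yields 

$$ \frac{d\psi_n}{d\xi} = \left(\frac{n}{2}\right)^{1/2}\psi_{n-1} - \left(\frac{n+1}{2}\right)^{1/2}\psi_{n+1} \tag{5.21} $$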
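
The normalization (5.18) and the relation (5.20) lend themselves to a quick numerical check. The following Python sketch is only an illustration: the function names, the integration interval and the number of steps are assumptions chosen here, and the integral is approximated by the trapezoidal rule.

```python
import math


def hermite_function(n, xi):
    """psi_n(xi) from (5.18), with H_n evaluated through the recursion (5.11)."""
    h_prev, h_cur = 1.0, 2.0 * xi            # H_0 and H_1
    if n == 0:
        h_cur = h_prev
    for k in range(1, n):
        h_prev, h_cur = h_cur, 2.0 * xi * h_cur - 2.0 * k * h_prev
    norm = math.pi ** 0.25 * math.sqrt(2.0 ** n * math.factorial(n))
    return math.exp(-xi * xi / 2.0) * h_cur / norm


def overlap(n, m, half_width=12.0, steps=4000):
    """Trapezoidal estimate of the integral of psi_n * psi_m over the real axis."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        xi = -half_width + i * h
        weight = 0.5 * h if i in (0, steps) else h
        total += weight * hermite_function(n, xi) * hermite_function(m, xi)
    return total


def relation_5_20_residual(n, xi):
    """Difference between the two sides of (5.20) at the point xi."""
    lhs = xi * hermite_function(n, xi)
    rhs = (math.sqrt(n / 2) * hermite_function(n - 1, xi)
           + math.sqrt((n + 1) / 2) * hermite_function(n + 1, xi))
    return lhs - rhs
```

For small n the quantity `overlap(n, n)` comes out equal to unity and `overlap(n, m)` with different indices vanishes to within rounding error, as required by (5.14) and (5.15).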


In conclusion we will give without proof the asymptotic form 
of $\psi_n(\xi)$ for $2n - \xi^2 \gg 1$: 

■'l’n(l) = — ^ n ^ ./ cos [Y» + —arc sin — 

^ b (2 n-l 2 ) 1 ' IA 2J (2 a)' 1 

+ ~L] (5.22) 


6. Canonical transformation as illustrated 
by the harmonic-oscillator problem 

When solving the oscillator problem, we took $\xi$ for the independent 
variable, thus expressing the state of the oscillator by a function 
of $\xi$, namely $\psi(\xi)$. Now let us take for the independent variable 
the quantum number n, which labels the oscillator’s energy 
levels. We expand $\psi(\xi)$ in a (complete) set of the energy 
eigenfunctions: 

$$ \psi(\xi) = \sum_{n=0}^{\infty}c_n\psi_n(\xi) \tag{6.1} $$
where the expansion coefficients are determined by simple 
integration: 

$$ c_n = \int_{-\infty}^{+\infty}\psi_n(\xi)\,\psi(\xi)\,d\xi \tag{6.2} $$

These coefficients, we know, can be interpreted as the wave 
function in terms of n. 

The next step is to find the form of our two simplest operators 

$$ \xi = \left(\frac{m\omega}{\hbar}\right)^{1/2}x, \qquad p_\xi = \frac{1}{(m\hbar\omega)^{1/2}}\,p_x \tag{6.3} $$

in terms of the new variable. We have 


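$$ \xi\psi(\xi) = \sum_{n=0}^{\infty}c_n\,\xi\psi_n(\xi) $$

If we use (5.20) for $\xi\psi_n$ and collect like terms, we get 

$$ \xi\psi(\xi) = \sum_{n=0}^{\infty}\left[\left(\frac{n}{2}\right)^{1/2}c_{n-1} + \left(\frac{n+1}{2}\right)^{1/2}c_{n+1}\right]\psi_n(\xi) \tag{6.4} $$

This means that the operator $\xi$ transforms a function with 
expansion coefficients $c_n$ into a function with expansion 
coefficients $c_n'$, where 

$$ c_n' = \xi c_n = \left(\frac{n}{2}\right)^{1/2}c_{n-1} + \left(\frac{n+1}{2}\right)^{1/2}c_{n+1} \tag{6.5} $$

(the symbol $\xi$ is understood to be an operator). The last 
relationship can be written as 

$$ c_n' = \sum_k(n|\xi|k)\,c_k \tag{6.6} $$

where 

$$ (n|\xi|k) = \left(\frac{n}{2}\right)^{1/2}\delta_{n-1,k} + \left(\frac{n+1}{2}\right)^{1/2}\delta_{n+1,k} \tag{6.7} $$

which implies that 

$$ (n|\xi|n-1) = \left(\frac{n}{2}\right)^{1/2}, \qquad (n|\xi|n+1) = \left(\frac{n+1}{2}\right)^{1/2} \tag{6.8} $$

while all other terms in (6.6) vanish. Thus the position operator, 
$\xi$, can be expressed as a matrix with elements (6.7). Written out 
in full 

$$ \xi = \begin{pmatrix} 0 & \sqrt{1/2} & 0 & 0 & 0 & \ldots \\ \sqrt{1/2} & 0 & \sqrt{2/2} & 0 & 0 & \ldots \\ 0 & \sqrt{2/2} & 0 & \sqrt{3/2} & 0 & \ldots \\ 0 & 0 & \sqrt{3/2} & 0 & \sqrt{4/2} & \ldots \\ \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \end{pmatrix} \tag{6.9} $$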
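We turn to the operator 

$$ p_\xi = -i\frac{\partial}{\partial\xi} $$

Reasoning along the same lines, we come to 

$$ p_\xi\psi(\xi) = -i\sum_{n=0}^{\infty}c_n\frac{d\psi_n}{d\xi} $$

then use (5.21) for $d\psi_n/d\xi$ and collect like terms. The result is 

$$ p_\xi\psi(\xi) = \sum_{n=0}^{\infty}\left[i\left(\frac{n}{2}\right)^{1/2}c_{n-1} - i\left(\frac{n+1}{2}\right)^{1/2}c_{n+1}\right]\psi_n(\xi) \tag{6.10} $$

Thus the operator $p_\xi$ can be expressed as a matrix with elements 

$$ (n|p_\xi|k) = i\left(\frac{n}{2}\right)^{1/2}\delta_{n-1,k} - i\left(\frac{n+1}{2}\right)^{1/2}\delta_{n+1,k} \tag{6.11} $$

In full form this matrix is 

$$ p_\xi = \begin{pmatrix} 0 & -i\sqrt{1/2} & 0 & \ldots \\ i\sqrt{1/2} & 0 & -i\sqrt{2/2} & \ldots \\ 0 & i\sqrt{2/2} & 0 & \ldots \\ \ldots & \ldots & \ldots & \ldots \end{pmatrix} \tag{6.12} $$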

The Hamiltonian (in dimensionless units) 

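$$ H = \frac{1}{2}\left(p_\xi^2 + \xi^2\right) \tag{6.13} $$

expressed in terms of the variable n, must reduce to a multiplication 
operator. To verify this we first note that 

$$ \sum_{n=0}^{\infty}c_n\left(-\frac{1}{2}\frac{d^2\psi_n}{d\xi^2} + \frac{1}{2}\xi^2\psi_n\right) = \sum_{n=0}^{\infty}\left(n + \frac{1}{2}\right)c_n\psi_n $$

in virtue of the differential equation (5.13) for $\psi_n$. It follows then 
that 

$$ Hc_n = \left(n + \frac{1}{2}\right)c_n \tag{6.14} $$

The matrix elements of H, which is obviously a diagonal matrix, 
are 

$$ (n|H|k) = \left(n + \frac{1}{2}\right)\delta_{nk} \tag{6.15} $$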
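We can obtain the same results if we square matrices $\xi$ and $p_\xi$. 
We have 

$$ \begin{aligned} (n|\xi^2|n') &= \sum_k(n|\xi|k)\,(k|\xi|n') \\ &= \sum_k\left[\left(\frac{n}{2}\right)^{1/2}\delta_{n-1,k} + \left(\frac{n+1}{2}\right)^{1/2}\delta_{n+1,k}\right] \times \left[\left(\frac{k}{2}\right)^{1/2}\delta_{k-1,n'} + \left(\frac{k+1}{2}\right)^{1/2}\delta_{k+1,n'}\right] \\ &= \frac{1}{2}[n(n-1)]^{1/2}\,\delta_{n-2,n'} + \left(n + \frac{1}{2}\right)\delta_{nn'} + \frac{1}{2}[(n+1)(n+2)]^{1/2}\,\delta_{n+2,n'} \end{aligned} \tag{6.16} $$

and similarly 

$$ \begin{aligned} (n|p_\xi^2|n') &= -\sum_k\left[\left(\frac{n}{2}\right)^{1/2}\delta_{n-1,k} - \left(\frac{n+1}{2}\right)^{1/2}\delta_{n+1,k}\right] \times \left[\left(\frac{k}{2}\right)^{1/2}\delta_{k-1,n'} - \left(\frac{k+1}{2}\right)^{1/2}\delta_{k+1,n'}\right] \\ &= -\frac{1}{2}[n(n-1)]^{1/2}\,\delta_{n-2,n'} + \left(n + \frac{1}{2}\right)\delta_{nn'} - \frac{1}{2}[(n+1)(n+2)]^{1/2}\,\delta_{n+2,n'} \end{aligned} \tag{6.17} $$

The half-sum of (6.16) and (6.17) is 

$$ \frac{1}{2}\left[(n|\xi^2|n') + (n|p_\xi^2|n')\right] = (n|H|n') = \left(n + \frac{1}{2}\right)\delta_{nn'} \tag{6.15*} $$

which is what it should be. 

The commutation relation for $\xi$ and $p_\xi$, 

$$ i(p_\xi\xi - \xi p_\xi) = 1 \tag{6.18} $$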

must also hold for our matrices. Indeed, according to the rule of 
matrix multiplication, 

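$$ (n|p_\xi\xi|n') = \frac{i}{2}[n(n-1)]^{1/2}\,\delta_{n-2,n'} - \frac{i}{2}\,\delta_{nn'} - \frac{i}{2}[(n+1)(n+2)]^{1/2}\,\delta_{n+2,n'} $$

$$ (n|\xi p_\xi|n') = \frac{i}{2}[n(n-1)]^{1/2}\,\delta_{n-2,n'} + \frac{i}{2}\,\delta_{nn'} - \frac{i}{2}[(n+1)(n+2)]^{1/2}\,\delta_{n+2,n'} $$

Hence 

$$ (n|i(p_\xi\xi - \xi p_\xi)|n') = \delta_{nn'} \tag{6.19} $$

which was what we set out to prove. 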
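
The matrix algebra of this section can also be verified numerically. The sketch below is an illustration under stated assumptions: the infinite matrices (6.9) and (6.12) are cut off at a finite size N (a choice made here, not in the text), so the last row and column of every product are distorted by the cut-off and are left out of the comparison; all names are hypothetical.

```python
import math

N = 8  # truncation size; the matrices of the text are infinite


def xi_element(n, k):
    """Matrix element (n|xi|k) from (6.7)."""
    if k == n - 1:
        return math.sqrt(n / 2)
    if k == n + 1:
        return math.sqrt((n + 1) / 2)
    return 0.0


def p_element(n, k):
    """Matrix element (n|p_xi|k) from (6.11)."""
    if k == n - 1:
        return 1j * math.sqrt(n / 2)
    if k == n + 1:
        return -1j * math.sqrt((n + 1) / 2)
    return 0.0


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


xi = [[xi_element(n, k) for k in range(N)] for n in range(N)]
p = [[p_element(n, k) for k in range(N)] for n in range(N)]

xi2, p2 = matmul(xi, xi), matmul(p, p)
hamiltonian = [[0.5 * (xi2[i][j] + p2[i][j]) for j in range(N)] for i in range(N)]

p_xi, xi_p = matmul(p, xi), matmul(xi, p)
commutator = [[1j * (p_xi[i][j] - xi_p[i][j]) for j in range(N)] for i in range(N)]
```

For n < N - 1 the diagonal elements of `hamiltonian` equal n + 1/2, as in (6.15*), and those of `commutator` equal unity, as in (6.19). The last diagonal element of each matrix deviates, which is an effect of the truncation and not of the theory.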
7. Heisenberg’s uncertainty relations 

The mathematical expectation of a physical quantity L 
in the nth state of an oscillator is 

$$ \text{M.E.}\,L = \int\bar\psi_nL\psi_n\,d\xi \tag{7.1} $$

This is obviously the diagonal element (n|L|n) of the matrix 
that corresponds to the operator L in the n-representation. It 
follows then that the mathematical expectations of the coordinate 
$x = \xi(\hbar/m\omega)^{1/2}$ and the momentum $p_x = p_\xi(m\hbar\omega)^{1/2}$ are zero. 
Indeed, the respective diagonal elements vanish: 

$$ \bar x = 0, \qquad \bar p_x = 0 \tag{7.2} $$

Let us now find the mathematical expectations of the squares 
of the deviations of x and $p_x$ from the means $\bar x$ and $\bar p_x$. We have 

$$ \text{M.E.}\,(x - \bar x)^2 = \text{M.E.}\,x^2 = \frac{\hbar}{m\omega}\,\text{M.E.}\,\xi^2 = \frac{\hbar}{m\omega}\,(n|\xi^2|n) \tag{7.3} $$

so that, if we denote the left-hand side as $(\Delta x)^2$, we get 

$$ (\Delta x)^2 = \text{M.E.}\,(x - \bar x)^2 = \frac{\hbar}{m\omega}\left(n + \frac{1}{2}\right) \tag{7.4} $$

Similarly 

$$ \text{M.E.}\,(p_x - \bar p_x)^2 = \text{M.E.}\,p_x^2 = m\hbar\omega\,(n|p_\xi^2|n) \tag{7.5} $$

or 

$$ (\Delta p_x)^2 = m\hbar\omega\left(n + \frac{1}{2}\right) \tag{7.6} $$

Thus 

$$ \Delta x = \left[\frac{\hbar}{m\omega}\left(n + \frac{1}{2}\right)\right]^{1/2} \tag{7.7} $$

$$ \Delta p_x = \left[m\hbar\omega\left(n + \frac{1}{2}\right)\right]^{1/2} \tag{7.8} $$

and we come to the final result 

$$ \Delta p_x\,\Delta x = \left(n + \frac{1}{2}\right)\hbar \tag{7.9} $$

We can interpret the two quantities, $\Delta p_x$ and $\Delta x$, as the 
root-mean-square, or standard, deviations of the measured values of 
the oscillator’s momentum and position from the mathematical 
expectations of these quantities. 

What we have proved is a very general result. If we understand 
(7.9) as a relationship between the orders of magnitude of the 
standard deviations and if we introduce a quantum number n in 
the proper way, the formula will be valid not only for an oscillator 
but for any system in its nth state. The product of the standard 
deviations will be minimal in the ground state (n = 0), which 
implies that the following inequality also holds: 

$$ \Delta p_x\,\Delta x \ge \hbar/2 \tag{7.10} $$


We will now prove that this inequality holds not only for an 
electron in one of the oscillator’s eigenstates but for an electron 
in any state. To this end we will take the mathematical expecta¬ 
tions of x and p x equal to zero (this restriction can be easily 
lifted). 

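Let the electron be in a certain state $\psi(x)$. We introduce two 
real constants $\alpha$ and $\beta$ and consider the inequality 

$$ \int_{-\infty}^{+\infty}\left|\alpha x\psi + \beta\frac{d\psi}{dx}\right|^2dx \ge 0 \tag{7.11} $$

which holds for all values of $\alpha$ and $\beta$. Evaluating the square of 
the modulus in the integrand, we come to 

$$ \alpha^2\int x^2\bar\psi\psi\,dx + \alpha\beta\int x\left(\bar\psi\frac{d\psi}{dx} + \frac{d\bar\psi}{dx}\psi\right)dx + \beta^2\int\frac{d\bar\psi}{dx}\frac{d\psi}{dx}\,dx \ge 0 \tag{7.12} $$

or 

$$ A\alpha^2 - B\alpha\beta + C\beta^2 \ge 0 \tag{7.13} $$

where 

$$ A = \int x^2\bar\psi\psi\,dx, \qquad B = -\int x\frac{d}{dx}(\bar\psi\psi)\,dx = \int\bar\psi\psi\,dx, \qquad C = \int\frac{d\bar\psi}{dx}\frac{d\psi}{dx}\,dx \tag{7.14} $$

For the quadratic form (7.13) to be non-negative it is necessary that 
$4AC \ge B^2$, or 

$$ A^{1/2}C^{1/2} \ge B/2 \tag{7.15} $$

since A, B and C are all positive. But according to (7.14) we have 

$$ A^{1/2} = \Delta x, \qquad C^{1/2} = \Delta p_x/\hbar, \qquad B = 1 \tag{7.16} $$

whence 

$$ \Delta p_x\,\Delta x \ge \hbar/2 $$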


which is what we set out to prove. 
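
The inequality $4AC \ge B^2$ can be illustrated numerically for particular wave functions. The following Python sketch is only an example under stated assumptions: it treats real wave functions, works in units where x is dimensionless, takes the derivative analytically, and estimates the integrals (7.14) by the trapezoidal rule; the function names and the two trial states are choices made here.

```python
import math


def weyl_coefficients(psi, dpsi, half_width=12.0, steps=4000):
    """Estimate A, B, C of (7.14) for a real wave function psi with derivative dpsi.

    The wave function is normalized inside the routine, so that B = 1 as in (7.16).
    """
    h = 2.0 * half_width / steps
    norm = a = c = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 * h if i in (0, steps) else h
        norm += weight * psi(x) ** 2
        a += weight * x * x * psi(x) ** 2
        c += weight * dpsi(x) ** 2
    return a / norm, 1.0, c / norm


def ground(x):
    return math.exp(-x * x / 2.0)


def d_ground(x):
    return -x * math.exp(-x * x / 2.0)


def first_excited(x):
    return x * math.exp(-x * x / 2.0)


def d_first_excited(x):
    return (1.0 - x * x) * math.exp(-x * x / 2.0)


A0, B0, C0 = weyl_coefficients(ground, d_ground)
A1, B1, C1 = weyl_coefficients(first_excited, d_first_excited)
```

For the Gaussian the equality $4AC = B^2$ is reached, which corresponds to the minimal product in (7.10). For the second trial function, which is proportional to the oscillator eigenfunction with n = 1, the product $A^{1/2}C^{1/2}$ equals 3/2, in agreement with (7.9).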

Similar inequalities evidently hold for the other two coordinates, 
so that in summary we have 

$$ \Delta p_x\,\Delta x \ge \hbar/2, \qquad \Delta p_y\,\Delta y \ge \hbar/2, \qquad \Delta p_z\,\Delta z \ge \hbar/2 \tag{7.17} $$

The inequalities (7.17) were first stated by Heisenberg, who in 
a number of examples from physics showed how an increase in 
the precision of measuring the position leads to a decrease in the 
precision of measuring the momentum, and vice versa. We note 
that the above formal derivation of (7.17) belongs to Hermann 
Weyl. 

8. The time dependence of matrices. 
A comparison with the classical theory 

The canonical transformation of Section 6 did not contain time 
and thus gave a representation in which the mathematical form 
of operators does not depend on time (see Section 13, Chapter III, 
Part I). We now turn to a representation in which the time de¬ 
pendence is, so to say, shifted to the operators themselves. For 
this we must find a unitary operator S(t) that will transform 
the initial (t = 0) state $\psi$ into a state at time t > 0: 

$$ \psi(x, t) = S(t)\,\psi(x, 0) \tag{8.1} $$

Then an operator L as a function of time will be represented as 

$$ L'(t) = S^+(t)\,L\,S(t) \tag{8.2} $$

If for the independent variable we take the energy, S(t) assumes 
the simplest form. In this variable the wave equation 

$$ H\psi - i\hbar\frac{\partial\psi}{\partial t} = 0 $$

reads 

$$ Hc_n - i\hbar\frac{dc_n}{dt} = 0 \tag{8.3} $$

or 

$$ \hbar\omega\left(n + \frac{1}{2}\right)c_n - i\hbar\frac{dc_n}{dt} = 0 \tag{8.4} $$

(Here we have shifted from the dimensionless units of Section 6 
to conventional units.) The solution to this equation is 

$$ c_n = c_n^0\exp\left[-i\left(n + \tfrac{1}{2}\right)\omega t\right] \tag{8.5} $$

Thus the application of S(t) reduces to multiplying $c_n^0$ by the 
exponential $\exp[-i(n + \frac{1}{2})\omega t]$. This implies that S(t) can be 
represented by a diagonal matrix 

$$ S(t) = \begin{pmatrix} e^{-i\omega t/2} & 0 & 0 & \ldots \\ 0 & e^{-3i\omega t/2} & 0 & \ldots \\ 0 & 0 & e^{-5i\omega t/2} & \ldots \\ \ldots & \ldots & \ldots & \ldots \end{pmatrix} \tag{8.6} $$

with matrix elements 

$$ (n|S(t)|n') = \exp\left[-i\left(n + \tfrac{1}{2}\right)\omega t\right]\delta_{nn'} \tag{8.7} $$

Using (8.2) to calculate Heisenberg’s matrix x(t), we get 

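$$ (n|x(t)|n') = e^{i(n+1/2)\omega t}\left[\left(\frac{n\hbar}{2m\omega}\right)^{1/2}\delta_{n-1,n'} + \left(\frac{(n+1)\hbar}{2m\omega}\right)^{1/2}\delta_{n+1,n'}\right]e^{-i(n'+1/2)\omega t} $$

or 

$$ (n|x(t)|n') = e^{i\omega t}\left(\frac{n\hbar}{2m\omega}\right)^{1/2}\delta_{n-1,n'} + e^{-i\omega t}\left(\frac{(n+1)\hbar}{2m\omega}\right)^{1/2}\delta_{n+1,n'} \tag{8.8} $$

which leads to the following matrix: 

$$ x(t) = \begin{pmatrix} 0 & \left(\frac{\hbar}{2m\omega}\right)^{1/2}e^{-i\omega t} & 0 & \ldots \\ \left(\frac{\hbar}{2m\omega}\right)^{1/2}e^{i\omega t} & 0 & \left(\frac{2\hbar}{2m\omega}\right)^{1/2}e^{-i\omega t} & \ldots \\ 0 & \left(\frac{2\hbar}{2m\omega}\right)^{1/2}e^{i\omega t} & 0 & \ldots \\ \ldots & \ldots & \ldots & \ldots \end{pmatrix} \tag{8.9} $$

In the same fashion 

$$ (n|p_x(t)|n') = i\left(\frac{nm\hbar\omega}{2}\right)^{1/2}e^{i\omega t}\,\delta_{n-1,n'} - i\left(\frac{(n+1)m\hbar\omega}{2}\right)^{1/2}e^{-i\omega t}\,\delta_{n+1,n'} \tag{8.10} $$

and 

$$ p_x(t) = \begin{pmatrix} 0 & -i\left(\frac{m\hbar\omega}{2}\right)^{1/2}e^{-i\omega t} & 0 & \ldots \\ i\left(\frac{m\hbar\omega}{2}\right)^{1/2}e^{i\omega t} & 0 & -i(m\hbar\omega)^{1/2}e^{-i\omega t} & \ldots \\ 0 & i(m\hbar\omega)^{1/2}e^{i\omega t} & 0 & \ldots \\ \ldots & \ldots & \ldots & \ldots \end{pmatrix} \tag{8.11} $$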

If according to (8.2) we were to form the operator 

$$ H'(t) = S^+(t)\,H\,S(t) $$

we would discover that its matrix coincides with that for H and 
hence does not depend on time. This is a logical result, since the 
oscillator’s energy remains constant. 

It can easily be shown that Heisenberg’s matrices satisfy the 
following equations of motion: 

$$ \frac{dx}{dt} = \frac{1}{m}p_x, \qquad \frac{dp_x}{dt} = -m\omega^2x \tag{8.12} $$

where dx/dt and $dp_x/dt$ are matrices whose elements are derivatives 
of the matrix elements of x(t) and $p_x(t)$. Equations (8.12) 
coincide in form with the classical equations. 

The matrix elements of x(t) and p x {t) resemble the terms in the 
Fourier series for the corresponding classical quantities. To study 
this analogy more closely let us see which quantity plays the role 
of the classical amplitude. 

When the oscillator is in state n, the probability that coordinate 
$\xi$ has a value lying between $\xi$ and $\xi + d\xi$ is expressed by 

$$ |\psi_n(\xi)|^2\,d\xi = \frac{1}{\pi^{1/2}\,2^nn!}\,e^{-\xi^2}H_n^2(\xi)\,d\xi \tag{8.13} $$

The asymptotic form of $\psi_n(\xi)$, (5.22), shows that for large 
values of n the function $\psi_n(\xi)$ is approximately a sinusoid when $\xi$ 
changes in the interval between $-(2n)^{1/2}$ and $(2n)^{1/2}$, with the 
polynomial $H_n(\xi)$ vanishing exactly n times. Outside this interval 
$\psi_n(\xi)$ begins to decrease rapidly because of the predominance of 
the exponential factor. It follows then that the probability density 
is noticeably nonzero only in the interval $-(2n)^{1/2} < \xi < (2n)^{1/2}$, 
which implies that 

$$ \xi_0 = (2n)^{1/2} \tag{8.14} $$

can be considered the “amplitude” of the oscillator. In conventional 
units the amplitude is 

$$ x_0 = \left(\frac{2n\hbar}{m\omega}\right)^{1/2} \tag{8.15} $$

On the other hand, the energy of the oscillator is 

$$ E = E_n = \left(n + \frac{1}{2}\right)\hbar\omega \tag{8.16} $$

Solving (8.15) for n and substituting the result into (8.16), we 
get 

$$ E \approx \frac{1}{2}m\omega^2x_0^2 \tag{8.17} $$

Hence the relation between amplitude and energy is the same as 
in the classical theory. If we compare the matrix elements of 
x(t) with the amplitude (8.15), we find that 

$$ (n|x(t)|n-1) + (n-1|x(t)|n) = \left(\frac{n\hbar}{2m\omega}\right)^{1/2}\left(e^{i\omega t} + e^{-i\omega t}\right) = \left(\frac{2n\hbar}{m\omega}\right)^{1/2}\cos\omega t = x_0\cos\omega t \tag{8.18} $$

Thus the matrix elements of x(t) that are closest to the nth 
diagonal element give the terms in the Fourier series that represents 
the classical quantity x(t) in the nth state (that is, in a 
state with $E = E_n$); here the values of n are considered large. 
This formal analogy between the terms in a Fourier series and the 
elements of a matrix representing a quantum operator served 
Heisenberg as a starting point for the creation of quantum 
mechanics, which in its earliest version was called matrix mechanics, 
as we know. 
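
The relations of this section can be illustrated by a short computation. The sketch below evaluates the matrix elements (8.8) and (8.10) and compares them with (8.18) and with the first of equations (8.12). It is an illustration only: the unit values chosen for $\hbar$, m and $\omega$, the function names, and the use of a central difference for the time derivative are all assumptions made here.

```python
import cmath
import math

HBAR = M = OMEGA = 1.0  # illustrative units, chosen only for this sketch


def x_element(n, k, t):
    """Matrix element (n|x(t)|k) from (8.8)."""
    if k == n - 1:
        return cmath.exp(1j * OMEGA * t) * math.sqrt(n * HBAR / (2 * M * OMEGA))
    if k == n + 1:
        return cmath.exp(-1j * OMEGA * t) * math.sqrt((n + 1) * HBAR / (2 * M * OMEGA))
    return 0j


def p_element(n, k, t):
    """Matrix element (n|p_x(t)|k) from (8.10)."""
    if k == n - 1:
        return 1j * cmath.exp(1j * OMEGA * t) * math.sqrt(n * M * HBAR * OMEGA / 2)
    if k == n + 1:
        return -1j * cmath.exp(-1j * OMEGA * t) * math.sqrt((n + 1) * M * HBAR * OMEGA / 2)
    return 0j


def fourier_sum(n, t):
    """Left-hand side of (8.18)."""
    return x_element(n, n - 1, t) + x_element(n - 1, n, t)


def amplitude(n):
    """The amplitude x_0 from (8.15)."""
    return math.sqrt(2 * n * HBAR / (M * OMEGA))


def velocity_element(n, k, t, step=1e-6):
    """Central-difference estimate of the time derivative of (n|x(t)|k)."""
    return (x_element(n, k, t + step) - x_element(n, k, t - step)) / (2 * step)
```

For any n the quantity `fourier_sum(n, t)` coincides with `amplitude(n) * cos(OMEGA * t)`, and `velocity_element(n, n - 1, t)` agrees with `p_element(n, n - 1, t) / M` to within the accuracy of the finite difference.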

9. An elementary criterion for the applicability 
of the formulas of classical mechanics 

When in Section 15, Chapter III, we considered the semiclassical 
approximation to the solution of the Schrödinger equation, 
we found the approximate expression of the wave function, $\psi$, in 
terms of the action integral S. Let us now apply the results of 
that section to the stationary state. 

The Schrödinger equation in this case is 

$$ \nabla^2\psi + \frac{2m}{\hbar^2}(E - U)\psi = 0 \tag{9.1} $$

and the Hamilton-Jacobi equation of classical mechanics reads 

$$ \frac{1}{2m}(\operatorname{grad}S)^2 + U = E \tag{9.2} $$

If S is the general solution to Eq. (9.2), a solution that contains 
three arbitrary constants $c_1$, $c_2$, and $c_3$ (including the constant 
energy E but not the additive term), then depending on the 
boundary conditions we can put 

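$$ \psi = \left|\det\left(\frac{\partial^2S}{\partial x_i\,\partial c_k}\right)\right|^{1/2}e^{iS/\hbar} \tag{9.3} $$

or 

$$ \psi = \left|\det\left(\frac{\partial^2S}{\partial x_i\,\partial c_k}\right)\right|^{1/2}\cos\left(\frac{S}{\hbar} + \alpha\right) \tag{9.4} $$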

where the expression in parentheses is the determinant of the 
second derivatives of S, and a is a constant phase. The transition 
from the equation of wave mechanics, (9.1), to the equation of 
classical mechanics, (9.2), is formally analogous to the transfor¬ 
mation from wave optics to geometrical optics. We can use the 
language of wave optics (or wave mechanics) to formulate the 
conditions under which the approximate formulas (9.3) and (9.4) 
can be applied: 

The relative change in the index of refraction (or wavelength) 
over a distance of one wavelength must be considerably smaller 
than unity. 


If instead of the wavelength $\lambda$ we use $\lambda/(2\pi)$ as the characteristic 
length, this condition can be written as 

$$ \frac{\lambda}{2\pi}\,\frac{|\operatorname{grad}\lambda|}{\lambda} = \left|\operatorname{grad}\frac{\lambda}{2\pi}\right| \ll 1 \tag{9.5} $$

In quantum mechanics $\lambda$ is the de Broglie wavelength: 

$$ \lambda = \frac{2\pi\hbar}{[2m(E - U)]^{1/2}} \tag{9.6} $$


But since the domain we are considering lies between classical 
mechanics and quantum mechanics, the criterion for applying (9.3) 
or (9.4) can be stated in the language of classical mechanics as 
well. Indeed, if we substitute the expression for $\lambda$, (9.6), into 
condition (9.5), we get 

$$ \frac{m\hbar}{[2m(E - U)]^{3/2}}\,|\operatorname{grad}U| \ll 1 \tag{9.7} $$

We introduce the absolute value of the particle’s velocity, v, 
and the absolute value of its acceleration, w, and write 

$$ [2m(E - U)]^{1/2} = mv \tag{9.8} $$

$$ |\operatorname{grad}U| = mw \tag{9.9} $$

Hence (9.7) gives 

$$ \frac{\hbar w}{mv^3} \ll 1 \tag{9.10} $$

or 

$$ \frac{mv^3}{\hbar w} \gg 1 \tag{9.11} $$

This is the very criterion that we were looking for. Apart 
from h it includes only quantities of classical mechanics; namely, 
kinematic quantities and the particle’s mass. 

Note that, according to the well-known formula of kinematics, 
we have 

$$ w^2 = \left(\frac{dv}{dt}\right)^2 + \left(\frac{v^2}{\rho}\right)^2 \tag{9.12} $$

where v is the absolute value of the velocity, and $\rho$ is the path’s 
radius of curvature. This implies that 

$$ w \ge \frac{v^2}{\rho} \tag{9.13} $$

which, after being substituted into (9.10), yields 

$$ \frac{\hbar}{mv\rho} \ll 1 \tag{9.14} $$

or 

$$ \frac{\lambda}{2\pi\rho} \ll 1 \tag{9.15} $$

where $\lambda$ is again the de Broglie wavelength. Thus the de Broglie 
wavelength must be considerably shorter than the path’s radius 
of curvature. 

The criterion expressed by formulas (9.10) and (9.11) can be 
applied in two different ways. First, if we consider velocity and 
acceleration as functions of position coordinates [Eqs. (9.8) and 
(9.9)], then (9.3) or (9.4) will give a good approximation to 
Schrödinger’s wave function. Second, instead of velocity and 
acceleration we can introduce their mean values into (9.11). The 
left-hand side will then become a certain constant whose order 
of magnitude will, when compared to unity, characterize the 
applicability of classical equations. 

In the initial stages of development of quantum mechanics 
Bohr formulated his famous correspondence principle. According 
to this principle, for large quantum numbers the formulas of 
quantum mechanics must transform into classical formulas. We 
can then expect that the above-mentioned parameter [the left-hand 
side of (9.11)] will be connected with the quantum number char¬ 
acteristic of the given problem. 

We will now show, using a simple example, that this is indeed 
the case. Let us consider the one-dimensional harmonic oscillator. 
Here the velocity is parallel to the acceleration. We will use the 
root-mean-square values of velocity and acceleration as the pa¬ 
rameters. We have 

$$ x = a\cos\omega t \tag{9.16} $$

and hence 

$$ \overline{v^2} = \overline{\dot x^2} = \frac{1}{2}a^2\omega^2, \qquad \overline{w^2} = \overline{\ddot x^2} = \frac{1}{2}a^2\omega^4 \tag{9.17} $$

This implies that 

$$ \frac{mv^3}{\hbar w} = \frac{ma^2\omega}{2\hbar} \tag{9.18} $$

The oscillator’s energy is expressed in terms of the amplitude as 

$$ E = \frac{1}{2}ma^2\omega^2 \tag{9.19} $$

and hence (9.18) reads 

$$ \frac{ma^2\omega}{2\hbar} = \frac{E}{\hbar\omega} \tag{9.20} $$

Thus for the one-dimensional oscillator criterion (9.11) takes on 
the following form: 

$$ \frac{mv^3}{\hbar w} = \frac{E}{\hbar\omega} \gg 1 \tag{9.21} $$

But according to (8.16) the energy of the oscillator is 

$$ E = E_n = \left(n + \frac{1}{2}\right)\hbar\omega \tag{9.22} $$

where n is the quantum number of the oscillator in a given state. 
This implies that (9.21) demands that n be large compared to 
unity. 
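
To get a feeling for the orders of magnitude involved, the parameter in (9.21) can be evaluated for particular oscillators. The sketch below is an illustration only; the function names and the sample masses, amplitudes and frequencies are assumptions chosen here, and the value of $\hbar$ is given in SI units.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s


def classical_parameter(mass, amplitude, omega):
    """Left-hand side of (9.11), m v^3 / (hbar w), with the r.m.s. values (9.17)."""
    v = amplitude * omega / math.sqrt(2)        # root-mean-square velocity
    w = amplitude * omega ** 2 / math.sqrt(2)   # root-mean-square acceleration
    return mass * v ** 3 / (HBAR * w)


def energy_ratio(mass, amplitude, omega):
    """E / (hbar omega) with the classical energy (9.19)."""
    return 0.5 * mass * amplitude ** 2 * omega ** 2 / (HBAR * omega)


# A gram-sized weight swinging through one centimetre once per second or so.
macroscopic = classical_parameter(1e-3, 1e-2, 1.0)

# A particle of electron mass oscillating over atomic distances.
atomic = classical_parameter(9.109e-31, 1e-10, 1e16)
```

For the macroscopic oscillator the parameter is enormous, so that classical formulas apply, while for the atomic example it is of the order of unity or less and the criterion (9.21) is not met. In both cases `classical_parameter` and `energy_ratio` return the same number, which is the content of (9.20).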

In summary we may say that the applicability of equations of 
the classical theory does not necessarily mean the applicability 
of classical concepts. The difference between the “probabilistic” 
way of describing phenomena, using a wave function, and the 
“absolute” way, using classical quantities, (this difference was 
mentioned in Part I) remains valid as well when classical quan¬ 
tities provide a good approximation to the wave function. 




Chapter II 


PERTURBATION THEORY 


1. Statement of the problem 

Only in a limited number of the simplest cases is it possible to 
solve the Schrodinger equation and find the eigenfunctions of the 
Hamiltonian, as well as of other operators. Solution of more 
complicated problems, say, the many-body problem, requires 
approximate, that is, essentially new, methods (for instance, the 
variational method). The many-electron problem will be studied 
in Part IV. Here we will consider the case when the process of 
solving a problem can be divided into two steps. The first step 
is to simplify the problem and then solve it exactly. The second is 
to calculate those corrections to the simplified problem that enable 
us to estimate the influence of the small terms omitted in the 
first step. There is a general method for calculating the corrections 
called perturbation theory, which we will develop in this chapter. 

Let us assume that there exists an operator (for definiteness we 
consider it to be the Hamiltonian) that can be represented as a 
sum of two terms: 

$$ H = H^0 + \varepsilon U \tag{1.1} $$

The first term, $H^0$, is the Hamiltonian of the unperturbed problem, 
or the zeroth-order Hamiltonian, and the second, $\varepsilon U$, is the 
correction, or the perturbation Hamiltonian, which we can regard 
as “small” (for the sake of convenience we write the correction 
as the product of the smallness parameter, $\varepsilon$, by the operator, U). 
We will consider the perturbation Hamiltonian to be such that 
when $\varepsilon$ is reduced to zero, the eigenvalues and eigenfunctions of 
H continuously transform to the eigenvalues and eigenfunctions 
of $H^0$. 

This condition in some cases is not satisfied, and the pertur¬ 
bation changes the type of solution, introducing a continuous spec¬ 
trum, for instance. The formal solution of the problem in such 
cases, however, has a physical meaning. The solution provides a 
wave function that describes the state of an atom that is not 
exactly stationary but can be considered quasi-stationary. But 
what is such a state? If we assume the obtained wave function to 
represent some initial state of an atom, then for a long period the 
state will but slightly differ from the initial one. The theory of 
quasi-stationary states will be discussed in Section 8, Chapter III. 




We must bear in mind that the perturbation theory series can 
diverge, which does not, however, deprive it of physical meaning. 
This is so if the first terms in the series decrease relatively fast. 
In this case we use only a finite number of terms. 

To return to our problem let us assume that the eigenfunctions 
$\psi_n^0(x)$ and eigenvalues $E_n^0$ of $H^0$ are known exactly, that is, we 
have found the solutions to 

$$ H^0\psi_n^0(x) = E_n^0\psi_n^0(x) \tag{1.2} $$

We need to find the approximate expressions for the eigenfunctions 
and eigenvalues of H, that is, solve the equation 

$$ (H^0 + \varepsilon U)\,\psi_n(x) = E_n\psi_n(x) \tag{1.3} $$

To solve the posed problem we must first solve the nonhomogeneous 
equation 

$$ H^0\psi - E'\psi = f \tag{1.4} $$

for the case when the parameter E' is one of the eigenvalues of H°. 
Depending on whether the eigenvalues of H° are degenerate or 
nondegenerate, the solutions of both the preliminary problem and 
the general problem differ. To clarify the idea of the method we 
will first consider nondegenerate eigenvalues. After that we will 
generalize the results and apply them to the degenerate case. 

2. Solution of the nonhomogeneous equation 

Let us consider the nonhomogeneous equation

$$H^0\psi - E'\psi = f \tag{2.1}$$

where $f$ is a known function, and $\psi$ is the sought function. Assume
$E'$ to be one of the eigenvalues $E_n^0$ of $H^0$. Equation (2.1) then has
the form

$$H^0\psi - E_n^0\psi = f \tag{2.2}$$

First we consider the eigenvalue $E_n^0$ to be nondegenerate, which
means that the corresponding homogeneous equation

$$H^0\psi - E_n^0\psi = 0 \tag{2.3}$$

has only one solution, $\psi = \psi_n^0$.

We expand the known function, $f$, in a series in the functions $\psi^0$:

$$f = \sum_m a_m\psi_m^0 + \int a(E)\,\psi_E^0\,dE \tag{2.4}$$
and look for the solution of (2.2) in the form of a similar series:

$$\psi = \sum_m c_m\psi_m^0 + \int c(E)\,\psi_E^0\,dE \tag{2.5}$$

which, after substituting (2.4) and (2.5) into (2.2), yields

$$\sum_m c_m\,(E_m^0 - E_n^0)\,\psi_m^0 + \int c(E)\,(E - E_n^0)\,\psi_E^0\,dE = \sum_m a_m\psi_m^0 + \int a(E)\,\psi_E^0\,dE \tag{2.6}$$

On the left-hand side of this equation the coefficient of $\psi_n^0$ is
zero. For the equation to have a (nontrivial) solution the coefficient
of the corresponding term on the right-hand side must also
vanish, that is

$$a_n = 0 \tag{2.7}$$

which can be rewritten as

$$\int \bar\psi_n^0\, f\, d\tau = 0 \tag{2.8}$$

Thus we conclude that 


The nonhomogeneous equation (2.2) has a solution if its right- 
hand side is orthogonal to the solution of the corresponding 
homogeneous equation. 


If this is so, the other coefficients $c_m$ and $c(E)$ can be found by
equating the corresponding terms on both sides of (2.6). This
yields

$$c_m = \frac{a_m}{E_m^0 - E_n^0}, \qquad c(E) = \frac{a(E)}{E - E_n^0} \tag{2.9}$$


Expansion (2.5) then reads

$$\psi = {\sum_m}'\, \frac{a_m}{E_m^0 - E_n^0}\,\psi_m^0 + \int \frac{a(E)}{E - E_n^0}\,\psi_E^0\,dE \tag{2.10}$$

where the prime on the summation sign means that the term with
$m = n$ must be omitted.

To this expression we can obviously add a solution of the homogeneous
equation, $c\psi_n^0$, where $c$ is an arbitrary constant.

If in Eq. (2.1) $E'$ had not been equal to any $E_n^0$, there would be
no need to impose restrictions of type (2.8) on $f$, and the solution
would have been

$$\psi = \sum_m \frac{a_m}{E_m^0 - E'}\,\psi_m^0 + \int \frac{a(E)}{E - E'}\,\psi_E^0\,dE \tag{2.11}$$

where $m$ runs through all values.
Now, let us turn to the degenerate case. As usual, we will
consider $E_0^0, E_1^0, E_2^0, \ldots$ to be different eigenvalues, so that their
degeneracy will be expressed in the fact that each of them (say
$E_n^0$) can have several (say $s$) eigenfunctions, which we denote

$$\psi_{n1}^0,\ \psi_{n2}^0,\ \ldots,\ \psi_{ns}^0 \tag{2.12}$$

with $s$ depending on $n$ in the general case. We note that the eigenvalues
corresponding to a continuous spectrum can also be degenerate.
For the sake of simplicity, however, we will write our equations
as though these eigenvalues were nondegenerate.

Let us assume that the homogeneous equation

$$H^0\psi - E_n^0\psi = 0 \tag{2.3}$$

has $s$ solutions (2.12) and we must find the solutions for the
nonhomogeneous equation

$$H^0\psi - E_n^0\psi = f \tag{2.2}$$

The expansions for the given function $f$ and the unknown function
$\psi$ are then

$$f = \sum_m \sum_{r=1}^{s} a_{mr}\,\psi_{mr}^0 + \int a(E)\,\psi_E^0\,dE \tag{2.13}$$

$$\psi = \sum_m \sum_{r=1}^{s} c_{mr}\,\psi_{mr}^0 + \int c(E)\,\psi_E^0\,dE \tag{2.14}$$

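Substituting these into (2.2), we get

$$\sum_m \sum_{r=1}^{s} (E_m^0 - E_n^0)\,c_{mr}\,\psi_{mr}^0 + \int (E - E_n^0)\,c(E)\,\psi_E^0\,dE = \sum_m \sum_{r=1}^{s} a_{mr}\,\psi_{mr}^0 + \int a(E)\,\psi_E^0\,dE \tag{2.15}$$

From this we conclude that

$$a_{n1} = a_{n2} = \ldots = a_{ns} = 0 \tag{2.16}$$

that is, $f$ must satisfy $s$ conditions

$$\int \bar\psi_{nr}^0\, f\, d\tau = 0, \qquad r = 1, 2, \ldots, s \tag{2.17}$$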

Thus we see that in the degenerate case 

The nonhomogeneous equation (2.2) has a solution if its right- 
hand side is orthogonal to each solution of the corresponding 
homogeneous equation. 
After we have determined all the coefficients $c_{mr}$ and $c(E)$, we
get the following expression for $\psi$:

$$\psi = {\sum_m}'\, \frac{1}{E_m^0 - E_n^0} \sum_{r=1}^{s} a_{mr}\,\psi_{mr}^0 + \int \frac{a(E)}{E - E_n^0}\,\psi_E^0\,dE \tag{2.18}$$

We see that for the degenerate and the nondegenerate case the 
equations are quite analogous and the criterion for the existence 
of a solution is formulated in almost the same way in both cases. 
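
The solvability criterion is easy to try out on a finite-dimensional analogue. The sketch below is only an illustration under stated assumptions: the spectrum is purely discrete, the function name `solve_in_eigenbasis` is hypothetical, and the inputs are the eigenvalues $E_m^0$ together with the expansion coefficients $a_m$ of $f$. It returns the coefficients $c_m$ of (2.9) and refuses to proceed when $f$ has a component along a solution of the homogeneous equation.

```python
from typing import List, Sequence


def solve_in_eigenbasis(levels: Sequence[float], a: Sequence[float], n: int,
                        tol: float = 1e-12) -> List[float]:
    """Coefficients c_m of a particular solution of (H0 - E_n^0) psi = f.

    levels[m] plays the role of E_m^0 and a[m] is the coefficient of f along
    the m-th eigenvector. Levels equal to levels[n] are treated as degenerate
    partners of level n; f must have no component along any of them.
    """
    coefficients = []
    for e_m, a_m in zip(levels, a):
        gap = e_m - levels[n]
        if abs(gap) < tol:
            if abs(a_m) > tol:
                raise ValueError("f is not orthogonal to the homogeneous solutions")
            coefficients.append(0.0)  # arbitrary constant, set to zero here
        else:
            coefficients.append(a_m / gap)  # Eq. (2.9)
    return coefficients


# f has no component along level 0, so the equation is solvable.
c = solve_in_eigenbasis([1.0, 2.0, 4.0], [0.0, 3.0, 6.0], 0)
```

Setting the undetermined coefficients to zero corresponds to dropping the arbitrary homogeneous solution; any other choice would serve equally well.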

3. Nondegenerate eigenvalues 

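Let us now return to our main problem, namely, the solution of

$$(H^0 + \varepsilon U)\psi_n = E_n\psi_n \tag{3.1}$$

and study the case when all the eigenvalues of $H^0$ are nondegenerate.

We look for the eigenvalue $E_n$ and the eigenfunction $\psi_n$ in the
form of expansions in powers of the smallness parameter $\varepsilon$:

$$E_n = E_n^0 + \varepsilon E_n' + \varepsilon^2 E_n'' + \ldots \tag{3.2}$$

$$\psi_n = \psi_n^0 + \varepsilon\psi_n' + \varepsilon^2\psi_n'' + \ldots \tag{3.3}$$

If we substitute these into Eq. (3.1) and equate like powers of $\varepsilon$,
we get a series of equations

$$H^0\psi_n^0 - E_n^0\psi_n^0 = 0 \tag{3.4a}$$

$$H^0\psi_n' - E_n^0\psi_n' = -U\psi_n^0 + E_n'\psi_n^0 \tag{3.4b}$$

$$H^0\psi_n'' - E_n^0\psi_n'' = -U\psi_n' + E_n'\psi_n' + E_n''\psi_n^0 \tag{3.4c}$$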


The first one is satisfied automatically since, according to the
initial assumption, $\psi_n^0$ is an eigenfunction of $H^0$ corresponding to
$E_n^0$. The second equation, (3.4b), is a nonhomogeneous equation
for determining $\psi_n'$. As we know, for this equation to have a
solution its right-hand side must be orthogonal to the solution of
the corresponding homogeneous equation, $\psi_n^0$. Using the fact
that $\psi_n^0$ is normalized, we can write this condition as

$$E_n' = (n|U|n) \tag{3.5}$$

where

$$(n|U|n) = \int \bar\psi_n^0\, U\, \psi_n^0\, d\tau \tag{3.6}$$

is the diagonal element of the matrix for the perturbation $U$. Thus
the orthogonality condition makes it possible to determine the
unknown constant $E_n'$. The next step in solving (3.4b) is to expand
its right-hand side in terms of the functions $\psi_m^0$ and $\psi_E^0$:

$$-U\psi_n^0 + E_n'\psi_n^0 = -{\sum_m}'\,(m|U|n)\,\psi_m^0 - \int (E|U|n)\,\psi_E^0\,dE \tag{3.7}$$

where

$$(m|U|n) = \int \bar\psi_m^0\, U\, \psi_n^0\, d\tau \tag{3.8}$$

$$(E|U|n) = \int \bar\psi_E^0\, U\, \psi_n^0\, d\tau \tag{3.8*}$$


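The solution $\psi_n'$ of Eq. (3.4b) can be obtained by using the
formulas of the previous section, namely

$$\psi_n' = {\sum_m}'\, \frac{(m|U|n)}{E_n^0 - E_m^0}\,\psi_m^0 + \int \frac{(E|U|n)}{E_n^0 - E}\,\psi_E^0\,dE \tag{3.9}$$

(We did not add the term $c\psi_n^0$ because $\psi_n'$ is obviously orthogonal
to $\psi_n^0$. Hence

$$\psi_n = \psi_n^0 + \varepsilon\psi_n' \tag{3.10}$$

which is the approximate solution to the perturbation problem, will
be normalized to within terms of the order of $\varepsilon^2$.)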

Let us now turn to the third equation, (3.4c). First, we must
determine the constant $E_n''$ from the condition that the equation
has a solution. This condition yields

$$E_n'' = \int \bar\psi_n^0\,(U - E_n')\,\psi_n'\, d\tau \tag{3.11}$$

If we then substitute expansion (3.9) for $\psi_n'$, from the completeness
property of eigenfunctions we get

$$E_n'' = {\sum_m}'\, \frac{|(m|U|n)|^2}{E_n^0 - E_m^0} + \int \frac{|(E|U|n)|^2}{E_n^0 - E}\,dE \tag{3.12}$$


After this we could find the correction $\psi_n''$ and the third, fourth,
etc. approximations. The calculations proceed in a similar manner:
after we find

$$\psi_n^0,\ \psi_n',\ \ldots,\ \psi_n^{(k-1)} \quad\text{and}\quad E_n^0,\ E_n',\ \ldots,\ E_n^{(k-1)}$$

the condition for the existence of a solution to the $k$th equation
yields $E_n^{(k)}$ and then $\psi_n^{(k)}$. The formulas become more and more
involved, but it is usually sufficient to take the first-order
approximation for the eigenfunction and the second-order approximation
for the eigenvalue. When $E_n'$, the first-order correction for the
eigenvalue $E_n$, is nonzero, we can limit ourselves to this if there
is no need for a higher accuracy.
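
Formulas (3.5) and (3.12) can be checked on a small matrix problem. The following sketch assumes a purely discrete, nondegenerate spectrum, so the integral over the continuous spectrum is absent; the function name `energy_to_second_order` and the two-level example are illustrative choices and do not come from the text.

```python
import math
from typing import Sequence


def energy_to_second_order(e0: Sequence[float], u: Sequence[Sequence[float]],
                           n: int, eps: float) -> float:
    """E_n^0 + eps*(n|U|n) + eps^2 * sum'_m |(m|U|n)|^2 / (E_n^0 - E_m^0)."""
    first = u[n][n]                                   # Eq. (3.5)
    second = sum(abs(u[m][n]) ** 2 / (e0[n] - e0[m])  # Eq. (3.12), discrete part
                 for m in range(len(e0)) if m != n)
    return e0[n] + eps * first + eps ** 2 * second


# Two levels 0 and 1 coupled by an off-diagonal perturbation.
e0 = [0.0, 1.0]
u = [[0.0, 1.0], [1.0, 0.0]]
eps = 0.1
approx = energy_to_second_order(e0, u, 0, eps)

# Exact lower eigenvalue of [[0, eps], [eps, 1]] for comparison.
exact = 0.5 * (1.0 - math.sqrt(1.0 + 4.0 * eps ** 2))
```

For this example the second-order value and the exact root differ only in terms of the fourth order in $\varepsilon$, as expected when the level spacing is large compared with the coupling.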


4. Degenerate eigenvalues. Expansion 
in powers of the smallness parameter 

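Let us consider the equation

$$(H^0 + \varepsilon U)\psi_n = E_n\psi_n \tag{4.1}$$

for the case when there is degeneracy, that is, when the eigenvalues
of $H^0$ are degenerate. We assume that the unperturbed equation

$$H^0\psi - E_n^0\psi = 0 \tag{4.2}$$

has $s$ solutions

$$\psi_{n1}^0,\ \psi_{n2}^0,\ \ldots,\ \psi_{ns}^0 \tag{4.3}$$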

We know (Section 7, Chapter II, Part I) that one can select
the $s$ solutions to a certain extent arbitrarily, which implies that
by performing a unitary transformation we can replace the
functions (4.3) by linear combinations of them. For what follows
it is convenient to take the $\psi_{nr}^0$ to be solutions chosen in a
specific way depending on $U$. The original $s$ solutions (whatever
they may be), whose linear combinations are the $\psi_{nr}^0$, we will denote

$$\varphi_1,\ \varphi_2,\ \ldots,\ \varphi_s \tag{4.4}$$

where we omit the subscript $n$ though keeping it in mind.

We know from algebra that under an infinitesimal change of the
coefficients of an algebraic equation a multiple root can separate
into several simple roots. Here too, by analogy, we can expect a
degenerate eigenvalue $E_n^0$ to separate into $s$ nondegenerate
eigenvalues $E_{nr}$ ($r = 1, 2, \ldots, s$) owing to the perturbation. For
this reason we will seek the eigenvalues in the form

$$E_{nr} = E_n^0 + \varepsilon E_{nr}' + \varepsilon^2 E_{nr}'' + \ldots \tag{4.5}$$

where the corrections depend on the number $r$ of the corresponding
eigenfunction. This last can be written as

$$\psi_{nr} = \psi_{nr}^0 + \varepsilon\psi_{nr}' + \varepsilon^2\psi_{nr}'' + \ldots \tag{4.6}$$
By substituting (4.5) and (4.6) into (4.1) (where we must change
$E_n$ to $E_{nr}$ and $\psi_n$ to $\psi_{nr}$) and equating like powers of $\varepsilon$ we get
a series of equations

$$H^0\psi_{nr}^0 - E_n^0\psi_{nr}^0 = 0 \tag{4.7a}$$

$$H^0\psi_{nr}' - E_n^0\psi_{nr}' = -U\psi_{nr}^0 + E_{nr}'\psi_{nr}^0 \tag{4.7b}$$

$$H^0\psi_{nr}'' - E_n^0\psi_{nr}'' = -U\psi_{nr}' + E_{nr}'\psi_{nr}' + E_{nr}''\psi_{nr}^0 \tag{4.7c}$$


These differ from the analogous equations (3.4a) — (3.4c) only in 
the addition of a second subscript to the eigenvalues and eigen¬ 
functions. The first equation, (4.7a), is satisfied automatically. 
For the second equation, (4.7b), to have a solution its right-hand
side must be orthogonal to each solution of the homogeneous equation:

$$\int \bar\psi_{np}^0\,(U - E_{nr}')\,\psi_{nr}^0\, d\tau = 0, \qquad p = 1, 2, \ldots, s \tag{4.8}$$

or, which is the same,

$$\int \bar\varphi_p\,(U - E_{nr}')\,\psi_{nr}^0\, d\tau = 0, \qquad p = 1, 2, \ldots, s \tag{4.9}$$

Equations (4.8) and (4.9) are not generally satisfied by arbitrary
solutions of (4.2). So we must find combinations

$$\psi_{nr}^0 = b_{1r}\varphi_1 + b_{2r}\varphi_2 + \ldots + b_{sr}\varphi_s \tag{4.10}$$

of the known solutions (4.4) that will satisfy (4.8) and (4.9). 

5. The eigenfunctions in the zeroth-order 
approximation 

Let us find the coefficients in the unitary transformation (4.10).
If we substitute (4.10) into (4.9) and recall that the functions $\varphi_p$
are orthogonal and normalized, we get

$$\sum_{q=1}^{s} U_{pq}\, b_{qr} = E_{nr}'\, b_{pr} \tag{5.1}$$

where

$$U_{pq} = \int \bar\varphi_p\, U\, \varphi_q\, d\tau \tag{5.2}$$

Omitting the second subscript in $b_{qr}$ and denoting the unknown
quantity $E_{nr}'$ by $\lambda$, we can write (5.1) as

$$\sum_{q=1}^{s} U_{pq}\, b_q = \lambda\, b_p \tag{5.3}$$
These equations can be interpreted as the equations for the
eigenfunctions $b(q) = b_q$ of the operator represented by the finite
matrix $U_{pq}$. This operator is hermitian since its matrix is hermitian,
which can be seen from (5.2). This implies that its eigenvalues
$\lambda$ are real. To find these and solve Eq. (5.3) we can use
a purely algebraic method. Equation (5.3) is a system of homogeneous
linear equations in $b_q$. Such a system will have a nontrivial
solution only if the system determinant

$$D(\lambda) = \begin{vmatrix} U_{11} - \lambda & U_{12} & \ldots & U_{1s} \\ U_{21} & U_{22} - \lambda & \ldots & U_{2s} \\ \ldots & \ldots & \ldots & \ldots \\ U_{s1} & U_{s2} & \ldots & U_{ss} - \lambda \end{vmatrix} \tag{5.4}$$

is zero. The equation $D(\lambda) = 0$ has $s$ real roots, and each root
$\lambda = \lambda_r$ has a corresponding solution $b_q = b_{qr}$ of (5.3). These
solutions can be normalized so that

$$|b_1|^2 + |b_2|^2 + \ldots + |b_s|^2 = 1 \tag{5.5}$$

It follows from the general properties of linear operators that
solutions

$$b_q = b_{qr} \quad\text{and}\quad b_q = b_{qp}$$

corresponding to two distinct roots

$$\lambda = \lambda_r \quad\text{and}\quad \lambda = \lambda_p$$

are orthogonal to each other, that is

$$\bar b_{1r}b_{1p} + \bar b_{2r}b_{2p} + \ldots + \bar b_{sr}b_{sp} = 0, \qquad \lambda_r \ne \lambda_p \tag{5.6}$$

The equation $D(\lambda) = 0$ can also have multiple roots. For
instance, if $\lambda = \lambda_r$ is a double root, there are two independent
solutions of Eq. (5.3), $b_q = b_q'$ and $b_q = b_q''$, corresponding to
this root. But no matter whether $D(\lambda) = 0$ has simple or multiple
roots we can always make all the $s$ solutions of (5.3) orthogonal
and normalized, so that

$$\bar b_{1p}b_{1q} + \bar b_{2p}b_{2q} + \ldots + \bar b_{sp}b_{sq} = \delta_{pq} \tag{5.7}$$

which implies that matrix $b$ with elements $b_{pq}$ is unitary. Indeed,
if we assume, as usual, that

$$b^+_{qp} = \bar b_{pq} \tag{5.8}$$

we can write (5.7) as

$$\sum_{r=1}^{s} b^+_{pr}\, b_{rq} = \delta_{pq} \tag{5.9}$$
It then follows, and this can easily be proved, that

$$\sum_{r=1}^{s} b_{pr}\, b^+_{rq} = \delta_{pq} \tag{5.10}$$

In matrix notation we may write (5.9) and (5.10), which express
the unitarity of matrix $b$, in the form

$$b^+ b = 1, \qquad b\, b^+ = 1 \tag{5.11}$$


We have thus found the unitary transformation (4.10) whose
coefficients satisfy Eq. (5.1), where the $E_{nr}'$ are the roots $\lambda_r$ of
equation $D(\lambda) = 0$:

$$D(E_{nr}') = 0 \tag{5.12}$$

The unitarity of transformation (4.10) shows that if the $\varphi_p$ are
orthogonal and normalized, the $\psi_{nr}^0$ will be too.

We must note that the $\psi_{nr}^0$ are determined uniquely only if all
roots of $D(\lambda) = 0$ are distinct. But if $\lambda = \lambda_1 = \lambda_2$ is, say, a double
root, then instead of the solutions $b_{q1} = b_q'$ and $b_{q2} = b_q''$ of Eq.
(5.3) we could have taken

$$b^*_{q1} = v_{11}b_{q1} + v_{21}b_{q2}, \qquad b^*_{q2} = v_{12}b_{q1} + v_{22}b_{q2} \tag{5.13}$$

where the matrix made up of the coefficients $v_{ik}$ is unitary. The
new solution (5.13) corresponds to new wave functions

$$\psi^{*0}_{n1} = v_{11}\psi_{n1}^0 + v_{21}\psi_{n2}^0, \qquad \psi^{*0}_{n2} = v_{12}\psi_{n1}^0 + v_{22}\psi_{n2}^0 \tag{5.14}$$

Hence for each multiple root there is an arbitrary unitary
transformation of the functions corresponding to this root. Such
unitary transformations, which remain arbitrary in the first
approximation, are usually determined from the second and higher
approximations.

We can express the roots $E_{nr}' = \lambda_r$ in terms of the $\psi_{nr}^0$. If we
denote

$$(nr|U|n'r') = \int \bar\psi_{nr}^0\, U\, \psi_{n'r'}^0\, d\tau \tag{5.15}$$

then on the basis of (4.8) we get

$$(nr|U|nr') = E_{nr}'\,\delta_{rr'} \tag{5.16}$$

which at $r = r'$ yields the sought-for expression for $E_{nr}'$.
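
For a doubly degenerate level ($s = 2$) the secular equation $D(\lambda) = 0$ is a quadratic and the coefficients $b_{qr}$ can be written down directly. The sketch below is restricted, as an assumption made for brevity, to a real symmetric matrix $U_{pq}$; the function name `zeroth_order_2x2` is hypothetical. It returns the two roots together with normalized solutions of (5.3).

```python
import math
from typing import List, Tuple


def zeroth_order_2x2(u11: float, u12: float, u22: float
                     ) -> List[Tuple[float, Tuple[float, float]]]:
    """Roots of D(lambda) = 0 and normalized solutions (b_1, b_2) of Eq. (5.3)
    for a real symmetric 2x2 matrix U_pq."""
    if u12 == 0.0:
        # The matrix is already diagonal: the phi_p themselves are suitable.
        return [(u11, (1.0, 0.0)), (u22, (0.0, 1.0))]
    mean = 0.5 * (u11 + u22)
    half = math.hypot(0.5 * (u11 - u22), u12)
    solutions = []
    for lam in (mean - half, mean + half):
        # (U_11 - lam) b_1 + U_12 b_2 = 0 is satisfied by b = (U_12, lam - U_11).
        b1, b2 = u12, lam - u11
        norm = math.hypot(b1, b2)
        solutions.append((lam, (b1 / norm, b2 / norm)))
    return solutions


# A perturbation that only couples the two degenerate states.
(lam_1, b_1), (lam_2, b_2) = zeroth_order_2x2(0.0, 1.0, 0.0)
```

In this example the level splits symmetrically into $E_n^0 \pm \varepsilon$, and the correct zeroth-order functions are the symmetric and antisymmetric combinations of $\varphi_1$ and $\varphi_2$.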
6. The first and higher approximations

We now turn to Eq. (4.7b). With our choice of $\psi_{nr}^0$ its right-hand
side satisfies the condition for the existence of a solution.
The expansion of the right-hand side then reads

$$-U\psi_{nr}^0 + E_{nr}'\psi_{nr}^0 = -{\sum_{mp}}'\,(mp|U|nr)\,\psi_{mp}^0 - \int (E|U|nr)\,\psi_E^0\,dE \tag{6.1}$$

where we have used (5.15) and put

$$(E|U|nr) = \int \bar\psi_E^0\, U\, \psi_{nr}^0\, d\tau \tag{6.2}$$

The prime on the summation sign in (6.1) means that the terms
with $m = n$ must be omitted from the sum. After solving Eq. (4.7b)
by the method of Section 2 we get

$$\psi_{nr}' = {\sum_m}'\, \frac{1}{E_n^0 - E_m^0} \sum_{p=1}^{s} (mp|U|nr)\,\psi_{mp}^0 + \int \frac{(E|U|nr)}{E_n^0 - E}\,\psi_E^0\,dE + \sum_{p=1}^{s} c_{pr}\,\psi_{np}^0 \tag{6.3}$$

Here the last sum is the solution of the homogeneous equation.
The constants in it, $c_{pr}$, are unknown and must be found from the
second approximation. In dealing with the second approximation,
Eq. (4.7c), we must first make sure that its right-hand side is
orthogonal to all solutions of the homogeneous equation. This
condition reads

$$\int \bar\psi_{nq}^0\,\bigl(-U\psi_{nr}' + E_{nr}'\psi_{nr}' + E_{nr}''\psi_{nr}^0\bigr)\, d\tau = 0, \qquad q = 1, 2, \ldots, s \tag{6.4}$$

If now we substitute the expression (6.3) for $\psi_{nr}'$ and introduce
the notation

$$U_{qr}'' = {\sum_m}'\, \frac{1}{E_n^0 - E_m^0} \sum_{p=1}^{s} (nq|U|mp)\,(mp|U|nr) + \int \frac{(nq|U|E)\,(E|U|nr)}{E_n^0 - E}\,dE \tag{6.5}$$

we can write Eq. (6.4) as

$$E_{nr}''\,\delta_{qr} = U_{qr}'' + (E_{nq}' - E_{nr}')\,c_{qr} \tag{6.6}$$

where we have used (5.16). At $q \ne r$ this reduces to

$$U_{qr}'' + (E_{nq}' - E_{nr}')\,c_{qr} = 0, \qquad q \ne r \tag{6.7}$$
If all the $E_{nq}'$ ($q = 1, 2, \ldots, s$) are different, that is, all
roots of $D(\lambda) = 0$ (see Section 5) are simple roots, then (6.7)
gives us all the $c_{qr}$ with different subscripts, namely

$$c_{qr} = -\frac{U_{qr}''}{E_{nq}' - E_{nr}'} \tag{6.8}$$


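But if some roots are multiple, for instance, $E_{n1}' = E_{n2}'$, the
corresponding sums $U_{12}''$ and $U_{21}''$ must vanish. This condition will be
satisfied if we properly choose the unitary transformation (5.14),
which up till now has remained arbitrary. Indeed, if we replace
$\psi_{n1}^0$ and $\psi_{n2}^0$ by their linear combinations $\psi^{*0}_{n1}$ and $\psi^{*0}_{n2}$, then $U_{qr}''$
will change to

$$U^{*\prime\prime}_{qr} = \sum_{i,k=1}^{2} v^+_{qi}\, U_{ik}''\, v_{kr}, \qquad q, r = 1, 2 \tag{6.9}$$

But according to (6.6) this must be equal to $E_{nr}''\,\delta_{qr}$:

$$\sum_{i,k=1}^{2} v^+_{qi}\, U_{ik}''\, v_{kr} = E_{nr}''\,\delta_{qr} \tag{6.10}$$

If we now premultiply this result by $v_{pq}$ and sum with respect
to $q$, we get

$$\sum_{k=1}^{2} U_{pk}''\, v_{kr} = E_{nr}''\, v_{pr} \tag{6.11}$$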

We have come to equations similar to (5.1) and by using the
method of Section 5 can find matrix $[v_{ik}]$. It could happen, however,
that matrix $[v_{ik}]$ would be determined nonuniquely for the
same reason as matrix $[b_{ik}]$ of Section 5. We would then have to
bring in the higher approximations.

Let us assume that we have found $[v_{ik}]$ and that the functions
$\psi_{nr}^0$ are “modified”, that is, if necessary, changed to their linear
combinations. Then for $E_{nq}' = E_{nr}'$ Eq. (6.7) will be satisfied automatically
and the corresponding $c_{qr}$ will remain undefined. If $E_{nq}' \ne E_{nr}'$, we
can find $c_{qr}$ by using (6.8). If $q = r$, then according to Eq. (6.6)
$c_{rr}$ also remains undefined and can be set equal to zero. Then the function
$\psi_{nr} = \psi_{nr}^0 + \varepsilon\psi_{nr}'$ will be normalized to within terms of the order
of $\varepsilon^2$. Finally, the second-order energy correction, $E_{nr}''$, will be

$$E_{nr}'' = U_{rr}'' \tag{6.12}$$

where $U_{rr}''$ is understood to be the modified $U''$, which before we
had denoted $U^{*\prime\prime}$.
We have obtained the first approximation to the eigenfunction
$\psi_{nr}$ and the second approximation to the energy. In a similar
manner we could find the higher approximations, but in view of
the complexity of the formulas they are of no practical interest.

7. The case of adjacent eigenvalues 

The formulas of Section 3 show that if two eigenvalues $E_n^0$ and $E_{n'}^0$
of the zeroth-order Hamiltonian are adjacent, the denominators
$E_n^0 - E_{n'}^0$ in expressions (3.9) and (3.12) for $\psi_n'$ and $E_n''$ become
small. Because of this the approximations will be poor, and if the
denominators $E_n^0 - E_{n'}^0$ are of the same order of magnitude as
the numerators (multiplied by $\varepsilon$), the above expressions cannot
be used at all. In other words, the desired wave functions and
eigenvalues cannot in this case be expanded in powers of the
smallness parameter. Instead we can use a method in which the
calculations are “reshuffled” so that in the terms with small
denominators the numerators are zero. In dealing with this method
we will study only the zeroth-order approximation.

We write the equation in question as

$$H\psi = E\psi \tag{7.1}$$

where

$$H = H^0 + \varepsilon U \tag{7.2}$$

and assume that the zeroth-order Hamiltonian, $H^0$, has two
adjacent eigenvalues $E_1^0$ and $E_2^0$ which correspond to the
eigenfunctions $\psi_1^0$ and $\psi_2^0$. We ask for that solution to (7.1) for which $E$
is close to $E_1^0$ (and $E_2^0$). As the starting approximation we will
take not $\psi_1^0$ or $\psi_2^0$, as was done in Section 3, but their linear
combination

$$\psi = c_1\psi_1^0 + c_2\psi_2^0 \tag{7.3}$$

Indeed, from (3.9) we conclude that the
leading term in $\psi_1'$ is proportional to $\psi_2^0$. For this reason it is
advisable to include this term in the starting approximation.
Furthermore, from (4.10) it follows that in the limiting case, when
$E_1^0$ coincides with $E_2^0$, the linear combination of $\psi_1^0$ and $\psi_2^0$ will
be the zeroth-order approximation for the wave function.
Substituting (7.3) into (7.1), we get the approximate equality

$$c_1(H - E)\psi_1^0 + c_2(H - E)\psi_2^0 = 0 \tag{7.4}$$

If we premultiply (7.4) first by $\bar\psi_1^0$, then by $\bar\psi_2^0$, and in both cases
integrate, we come to two equations:

$$(H_{11} - E)\,c_1 + H_{12}\,c_2 = 0, \qquad H_{21}\,c_1 + (H_{22} - E)\,c_2 = 0 \tag{7.5}$$

where

$$H_{ik} = \int \bar\psi_i^0\, H\, \psi_k^0\, d\tau, \qquad i, k = 1, 2 \tag{7.6}$$


Notice that the off-diagonal matrix elements $H_{12}$ and $H_{21}$ are
small compared to the diagonal elements $H_{11}$ and $H_{22}$, since for
the zeroth-order Hamiltonian $H^0_{12} = H^0_{21} = 0$. Equations (7.5) serve
to determine the coefficients $c_1$ and $c_2$ and the parameter $E$. We
set the determinant

$$D(E) = \begin{vmatrix} H_{11} - E & H_{12} \\ H_{21} & H_{22} - E \end{vmatrix} \tag{7.7}$$

equal to zero and solve the quadratic equation

$$E^2 - (H_{11} + H_{22})\,E + H_{11}H_{22} - |H_{12}|^2 = 0 \tag{7.8}$$

to get for $E$ two values

$$E = \tfrac{1}{2}(H_{11} + H_{22}) \pm \tfrac{1}{2}\bigl[(H_{11} - H_{22})^2 + 4|H_{12}|^2\bigr]^{1/2} \tag{7.9}$$

which we denote $E_1^*$ and $E_2^*$. It follows from (7.2) that

$$H_{ik} = E_i^0\,\delta_{ik} + \varepsilon U_{ik} \tag{7.10}$$


where $U_{ik}$ is determined by a formula similar to (7.6). Therefore
$H_{12}$ is of the order of $\varepsilon$, and if $E_1^0 - E_2^0$ were not small, we could
have expanded (7.9) in a power series in $\varepsilon$. But since this difference
is small by definition, the expansion will either converge
poorly or even diverge. It is now clear why the method of Section 3
does not work when two eigenvalues are adjacent. In the first-order
approximation the two values of $E$ given by (7.9) yield the
perturbed eigenvalues $E_1$ and $E_2$ of $H$, the eigenvalues that
correspond to the zeroth-order eigenvalues $E_1^0$ and $E_2^0$. We can use
Eqs. (7.5) to find the coefficients $c_1$ and $c_2$. This is most
conveniently done by introducing two auxiliary real quantities, $\alpha$
and $\beta$:

$$\frac{2H_{12}}{H_{11} - H_{22}} = -\tan\alpha\; e^{i\beta} \tag{7.11}$$

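From Eqs. (7.5), (7.9), and (7.11) it is easy to see that the
ratio $c_1/c_2$ takes two values:

$$\frac{c_{11}}{c_{21}} = -\cot\frac{\alpha}{2}\; e^{i\beta}, \qquad \frac{c_{12}}{c_{22}} = \tan\frac{\alpha}{2}\; e^{i\beta} \tag{7.12}$$

If we now choose

$$c_{11} = \cos\frac{\alpha}{2}\, e^{i\beta/2}, \quad c_{12} = \sin\frac{\alpha}{2}\, e^{i\beta/2}, \quad c_{21} = -\sin\frac{\alpha}{2}\, e^{-i\beta/2}, \quad c_{22} = \cos\frac{\alpha}{2}\, e^{-i\beta/2} \tag{7.13}$$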

we get the normalized solutions to (7.5). The corresponding
eigenfunctions will be

$$\psi_1^* = \cos\frac{\alpha}{2}\, e^{i\beta/2}\,\psi_1^0 - \sin\frac{\alpha}{2}\, e^{-i\beta/2}\,\psi_2^0, \qquad \psi_2^* = \sin\frac{\alpha}{2}\, e^{i\beta/2}\,\psi_1^0 + \cos\frac{\alpha}{2}\, e^{-i\beta/2}\,\psi_2^0 \tag{7.14}$$

It is these functions that serve as the starting approximation. If
we use them to compose the matrix elements of $H$, we get

$$H^*_{ik} = \int \bar\psi_i^*\, H\, \psi_k^*\, d\tau = E_i^*\,\delta_{ik}, \qquad i, k = 1, 2 \tag{7.15}$$

so that $H^*_{12} = 0$. Therefore, if we go on to the next approximation
and build expressions similar to (3.9) or (3.12) using $\psi_1^*$ and $\psi_2^*$ and
the remaining functions $\psi_3^0$, $\psi_4^0$, ..., we will not get terms with
the small denominator $E_1^0 - E_2^0$. This implies that these expressions
will indeed represent small corrections.
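
The statement $H^*_{12} = 0$ can be verified numerically for any particular pair of adjacent levels. The sketch below builds the coefficients (7.13) from $\alpha$ and $\beta$ as defined by (7.11) and transforms the matrix $H_{ik}$. The helper names and the numerical values of $H_{11}$, $H_{22}$, $H_{12}$ are assumptions chosen for illustration only, and the case $H_{11} = H_{22}$ is excluded.

```python
import cmath
import math


def mixing_matrix(h11: float, h22: float, h12: complex):
    """Coefficients c[p][k] of Eq. (7.13): psi*_k = c[0][k] psi_1^0 + c[1][k] psi_2^0.

    Assumes h11 != h22 (exact coincidence is the degenerate case of Section 5).
    """
    z = -2.0 * h12 / (h11 - h22)      # equals tan(alpha) * exp(i*beta), Eq. (7.11)
    alpha = math.atan(abs(z))         # chosen in the range 0 <= alpha < pi/2
    beta = cmath.phase(z)
    ca, sa = math.cos(alpha / 2), math.sin(alpha / 2)
    ep, em = cmath.exp(0.5j * beta), cmath.exp(-0.5j * beta)
    return [[ca * ep, sa * ep],
            [-sa * em, ca * em]]


def transform(h, c):
    """Matrix elements H*_ik = sum over p, q of conj(c[p][i]) * h[p][q] * c[q][k]."""
    return [[sum(c[p][i].conjugate() * h[p][q] * c[q][k]
                 for p in range(2) for q in range(2))
             for k in range(2)] for i in range(2)]


h11, h22, h12 = 1.0, 1.1, 0.03 + 0.04j
h = [[h11, h12], [h12.conjugate(), h22]]
h_star = transform(h, mixing_matrix(h11, h22, h12))

# The two roots of Eq. (7.9) for comparison with the diagonal of h_star.
root = 0.5 * math.sqrt((h11 - h22) ** 2 + 4 * abs(h12) ** 2)
levels = (0.5 * (h11 + h22) - root, 0.5 * (h11 + h22) + root)
```

With these numbers the splitting of the levels, about 0.14, is governed by $|H_{12}|$ as much as by $H_{11} - H_{22}$, which is exactly the situation in which the expansion of Section 3 fails.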

8. The anharmonic oscillator 

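As an example of how perturbation theory works we will take
the case of the one-dimensional anharmonic oscillator. We assume
that the system of units is such that the eigenvalue equation for
the Hamiltonian is

$$-\frac{1}{2}\frac{d^2\psi}{d\xi^2} + \Bigl(\frac{1}{2}\xi^2 + \varepsilon\xi^3\Bigr)\psi = E\psi \tag{8.1}$$

where $\varepsilon\xi^3 = \varepsilon U$ is the perturbation Hamiltonian. We ask for the
eigenfunction in the first-order approximation and the eigenvalue
in the second. The eigenfunctions and eigenvalues of the unperturbed
equation

$$-\frac{1}{2}\frac{d^2\psi_n^0}{d\xi^2} + \frac{1}{2}\xi^2\psi_n^0 = E_n^0\psi_n^0 \tag{8.2}$$

are already known. They are

$$E_n^0 = n + \frac{1}{2} \tag{8.3}$$

$$\psi_n^0(\xi) = \bigl(2^n\, n!\, \sqrt{\pi}\bigr)^{-1/2}\, e^{-\xi^2/2}\, H_n(\xi) \tag{8.4}$$

where $H_n(\xi)$ is the Hermite polynomial of degree $n$.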
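The perturbation operator, $U = \xi^3$, has matrix elements

$$(n|U|n') = \int_{-\infty}^{+\infty} \psi_n^0(\xi)\, \xi^3\, \psi_{n'}^0(\xi)\, d\xi \tag{8.5}$$

and the simplest way to obtain these elements is to multiply the
matrices $\xi$ and $\xi^2$ of Section 6, Chapter I, whose elements are

$$(n|\xi|n') = \Bigl(\frac{n}{2}\Bigr)^{1/2}\delta_{n-1,\,n'} + \Bigl(\frac{n+1}{2}\Bigr)^{1/2}\delta_{n+1,\,n'} \tag{8.6}$$

$$(n|\xi^2|n') = \frac{1}{2}\bigl[n(n-1)\bigr]^{1/2}\delta_{n-2,\,n'} + \Bigl(n + \frac{1}{2}\Bigr)\delta_{n n'} + \frac{1}{2}\bigl[(n+1)(n+2)\bigr]^{1/2}\delta_{n+2,\,n'} \tag{8.7}$$

We find that

$$(n|U|n') = (n|\xi^3|n') = \sum_k (n|\xi|k)\,(k|\xi^2|n')$$

$$= \Bigl(\frac{n(n-1)(n-2)}{8}\Bigr)^{1/2}\delta_{n-3,\,n'} + \Bigl(\frac{9n^3}{8}\Bigr)^{1/2}\delta_{n-1,\,n'} + \Bigl(\frac{9(n+1)^3}{8}\Bigr)^{1/2}\delta_{n+1,\,n'} + \Bigl(\frac{(n+1)(n+2)(n+3)}{8}\Bigr)^{1/2}\delta_{n+3,\,n'} \tag{8.8}$$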


The diagonal element of $U$ is zero, which can also be directly
observed from formula (8.5). Hence the first-order correction to
the eigenvalue vanishes. The approximate expression for the
eigenfunction is

$$\psi_n = \psi_n^0 + \varepsilon\psi_n' \tag{8.9}$$

where according to (3.9)

$$\psi_n' = {\sum_m}'\, \frac{(m|U|n)}{E_n^0 - E_m^0}\,\psi_m^0 \tag{8.10}$$

In our case only four terms in this sum are nonzero, namely

$$\psi_n' = \frac{(n-3|U|n)}{E_n^0 - E_{n-3}^0}\,\psi_{n-3}^0 + \frac{(n-1|U|n)}{E_n^0 - E_{n-1}^0}\,\psi_{n-1}^0 + \frac{(n+1|U|n)}{E_n^0 - E_{n+1}^0}\,\psi_{n+1}^0 + \frac{(n+3|U|n)}{E_n^0 - E_{n+3}^0}\,\psi_{n+3}^0 \tag{8.10'}$$
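Substituting the matrix elements according to (8.8), we finally
get

$$\psi_n' = \frac{1}{3}\Bigl(\frac{n(n-1)(n-2)}{8}\Bigr)^{1/2}\psi_{n-3}^0 + \Bigl(\frac{9n^3}{8}\Bigr)^{1/2}\psi_{n-1}^0 - \Bigl(\frac{9(n+1)^3}{8}\Bigr)^{1/2}\psi_{n+1}^0 - \frac{1}{3}\Bigl(\frac{(n+1)(n+2)(n+3)}{8}\Bigr)^{1/2}\psi_{n+3}^0 \tag{8.11}$$

We still have to calculate, using (3.12), the second-order
correction to the energy. We find that

$$E_n'' = \frac{1}{3}\,\frac{n(n-1)(n-2)}{8} + \frac{9n^3}{8} - \frac{9(n+1)^3}{8} - \frac{1}{3}\,\frac{(n+1)(n+2)(n+3)}{8} = -\frac{15}{4}\Bigl(n^2 + n + \frac{11}{30}\Bigr) \tag{8.12}$$

Hence the energy in this approximation is

$$E_n = \Bigl(n + \frac{1}{2}\Bigr) - \frac{15}{4}\,\varepsilon^2\Bigl(n^2 + n + \frac{11}{30}\Bigr) \tag{8.13}$$

and we have completed the solution.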

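Now let us assume that besides $\varepsilon\xi^3$ the perturbation Hamiltonian
has a term $\delta\xi^4$. The eigenvalue equation will then be

$$-\frac{1}{2}\frac{d^2\psi}{d\xi^2} + \Bigl(\frac{1}{2}\xi^2 + \varepsilon\xi^3 + \delta\xi^4\Bigr)\psi = E\psi \tag{8.14}$$

If $\delta$ is of the order of $\varepsilon^2$, the first-order approximation for the
eigenfunction will remain unchanged and to the expression (8.13)
for the eigenvalue a new term will be added, the diagonal matrix
element of $\delta\xi^4$. Let us calculate the additional term. We have

$$(n|\xi^4|n) = (n|\xi|n-1)\,(n-1|\xi^3|n) + (n|\xi|n+1)\,(n+1|\xi^3|n) = \frac{3}{4}n^2 + \frac{3}{4}(n+1)^2$$

so that

$$\delta\,(n|\xi^4|n) = \frac{3}{2}\,\delta\,\Bigl(n^2 + n + \frac{1}{2}\Bigr) \tag{8.15}$$

Adding (8.15) to (8.13) yields the eigenvalue $E_n$:

$$E_n = \Bigl(n + \frac{1}{2}\Bigr) + \Bigl(\frac{3}{2}\delta - \frac{15}{4}\varepsilon^2\Bigr)\Bigl(n + \frac{1}{2}\Bigr)^2 + \frac{3}{8}\delta - \frac{7}{16}\varepsilon^2 \tag{8.16}$$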
In the particular case of $\delta = 5\varepsilon^2/2$ formula (8.16) differs from
$E_n^0$ only by a constant term $\varepsilon^2/2$.
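
The arithmetic leading to (8.12) and to the constant shift $\varepsilon^2/2$ can be reproduced exactly with rational numbers, since only the squares of the matrix elements (8.8) enter the second-order sum (3.12). The sketch below is an independent check, not part of the derivation; the function names are illustrative.

```python
from fractions import Fraction


def xi3_squared(m: int, n: int) -> Fraction:
    """Square of the matrix element (m|xi^3|n) taken from Eq. (8.8)."""
    if m == n - 3:
        return Fraction(n * (n - 1) * (n - 2), 8)
    if m == n - 1:
        return Fraction(9 * n ** 3, 8)
    if m == n + 1:
        return Fraction(9 * (n + 1) ** 3, 8)
    if m == n + 3:
        return Fraction((n + 1) * (n + 2) * (n + 3), 8)
    return Fraction(0)


def second_order(n: int) -> Fraction:
    """E_n'' from Eq. (3.12); the denominator E_n^0 - E_m^0 equals n - m."""
    return sum((xi3_squared(m, n) / (n - m)
                for m in (n - 3, n - 1, n + 1, n + 3)), Fraction(0))


def quartic_diagonal(n: int) -> Fraction:
    """Diagonal element (n|xi^4|n) as computed just before Eq. (8.15)."""
    return Fraction(3, 4) * n ** 2 + Fraction(3, 4) * (n + 1) ** 2


def closed_form(n: int) -> Fraction:
    """Right-hand side of Eq. (8.12)."""
    return -Fraction(15, 4) * (n * n + n + Fraction(11, 30))


# Coefficient of eps^2 in E_n - E_n^0 when delta = 5 eps^2 / 2.
shifts = [second_order(n) + Fraction(5, 2) * quartic_diagonal(n) for n in range(6)]
```

The matrix elements with a negative first index vanish automatically because of the factors $n$, $n-1$, $n-2$, so no special treatment of the lowest levels is needed.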

We note in conclusion that by adding terms of type $\varepsilon\xi^3$ or $\delta\xi^4$
to the potential energy of the harmonic oscillator we have changed
the type of eigenfunction of the Hamiltonian. We have thus come
to a case mentioned in Section 1, when the formal application of
perturbation theory leads to divergent series and gives a state
that is not exactly stationary but quasi-stationary. The formal
character of the solution appears in the fact that for large values
of $n$ the correction to the energy in (8.16) ceases to be small.




Chapter III 


RADIATION, THE THEORY OF DISPERSION, 
AND THE LAW OF DECAY 


1. Classical formulas 

The quantum laws governing phenomena in which the finiteness 
of the speed of propagation of actions (interaction) does not play 
any role have clearly been established. They can be formulated in 
terms of the concepts introduced in Part I and also by bringing in 
a new principle (the famous Pauli exclusion principle) needed for 
statement of the many-body problem. We will examine Pauli’s 
theory of the electron and the many-body problem in Parts III 
and IV. 

On the other hand, the theory of phenomena in which the 
finiteness of the speed of propagation of actions does play a role 
has yet to be completed. These phenomena include, first of all, 
those that are studied in electrodynamics and the theory of rela¬ 
tivity. 

Quantum generalization of each of these two theories requires 
considering systems of an indefinite number of particles (photons 
in the case of electrodynamics and electrons and positrons in rel¬ 
ativistic quantum mechanics). Here the full theory of such systems 
will not be given, nor will the quantum theory of radiation 
(quantum electrodynamics). We will confine ourselves to the 
derivation of formulas using the classical theory as guide. In 
doing so we will allow ourselves a certain degree of inconsistency. 
Namely, we will describe the atomic system in quantum terms and 
the radiation in classical terms, not directly introducing the con¬ 
cept of light quanta (photons). Aside from this, the approximate 
(semiclassical) nature of the theory can serve to justify a certain 
imprecision of expression, such as “the electron is inside a given 
volume” (instead of “the electron can be detected in a given 
volume by a certain kind of experiment”). 

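Before considering the theory of radiation let us recall the main
formulas of the classical theory. The Maxwell equations in Lorentz
form are

$$\operatorname{curl}\mathcal{H} - \frac{1}{c}\frac{\partial\mathcal{E}}{\partial t} = \frac{4\pi}{c}\,\rho\mathbf{v}, \qquad \operatorname{div}\mathcal{E} = 4\pi\rho \tag{1.1}$$

$$\operatorname{curl}\mathcal{E} + \frac{1}{c}\frac{\partial\mathcal{H}}{\partial t} = 0, \qquad \operatorname{div}\mathcal{H} = 0 \tag{1.2}$$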
Here $\mathcal{E}$ and $\mathcal{H}$ are the electric and magnetic field vectors, $\rho$ is the
electric charge density, $\mathbf{v}$ is the velocity of the electrons, and $c$
is the speed of light. The charge density $\rho$ and the current density
$\rho\mathbf{v}$ on the right-hand sides of (1.1) satisfy the equation of
continuity

$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = 0 \tag{1.3}$$

which, written in the integral form,

$$\frac{d}{dt}\int_{V_0} \rho\, d\tau = -\int_{\sigma_0} \rho v_n\, d\sigma \tag{1.4}$$

expresses the fact that the change in the number of electrons inside
the volume $V_0$ is equal to the number of electrons that pass
through the surface $\sigma_0$ encompassing the volume. Let us put

$$\mathcal{E} = -\operatorname{grad}\varphi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathcal{H} = \operatorname{curl}\mathbf{A} \tag{1.5}$$

where $\varphi$ is the scalar potential, and $\mathbf{A}$ the vector potential. Both
potentials we will subject to the usual condition (the Lorentz
condition)

$$\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t} = 0 \tag{1.6}$$

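By substituting (1.5) into the Maxwell equations (1.1) and
(1.2) we find that Eqs. (1.2) are satisfied automatically and that
Eqs. (1.1), if we use condition (1.6), read

$$\Delta\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\frac{4\pi}{c}\,\rho\mathbf{v}, \qquad \Delta\varphi - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = -4\pi\rho \tag{1.7}$$

Let us assume that the space in which the fields are considered
is unlimited. To make the solution of (1.7) unique we will make
it a condition that there are no ingoing waves, that is, waves that
arrive from infinity. This can be written

$$\lim_{r\to\infty} r\Bigl(\frac{\partial f}{\partial r} + \frac{1}{c}\frac{\partial f}{\partial t}\Bigr) = 0 \tag{1.8}$$

where $f$ is $A_x$, $A_y$, $A_z$ or $\varphi$. The solution of (1.7) that satisfies this
condition can be expressed in terms of retarded potentials, namely

$$\mathbf{A} = \frac{1}{c}\int \frac{(\rho\mathbf{v})_{t - |\mathbf{r} - \mathbf{r}'|/c}}{|\mathbf{r} - \mathbf{r}'|}\, d\tau', \qquad \varphi = \int \frac{(\rho)_{t - |\mathbf{r} - \mathbf{r}'|/c}}{|\mathbf{r} - \mathbf{r}'|}\, d\tau' \tag{1.9}$$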
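If $\rho$ and $\rho\mathbf{v}$ depend on time via a periodic factor, that is if

$$\rho = \rho_0\, e^{i\omega t}, \qquad \rho\mathbf{v} = (\rho\mathbf{v})_0\, e^{i\omega t} \tag{1.10}$$

then the above formulas become

$$\mathbf{A} = \frac{e^{i\omega t}}{c}\int (\rho\mathbf{v})_0\, e^{-i\omega|\mathbf{r} - \mathbf{r}'|/c}\, \frac{d\tau'}{|\mathbf{r} - \mathbf{r}'|}, \qquad \varphi = e^{i\omega t}\int \rho_0\, e^{-i\omega|\mathbf{r} - \mathbf{r}'|/c}\, \frac{d\tau'}{|\mathbf{r} - \mathbf{r}'|} \tag{1.11}$$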

These are the classical formulas for the electromagnetic field 
generated by a continuous distribution of electric charges. We 
must now change them so as to account for the quantum nature 
of matter. 

2. Charge density and current density 

In the Maxwell equations (1.1) and (1.2) matter was charac¬ 
terized by charge density p and current density pv. We must now 
find the quantum analogues to these quantities. The analogues can 
be of two types — mathematical expectations and operators. Since 
the classical quantities appear in our formulas as functions of 
time, we must choose a representation in which the operators will 
explicitly depend on time (see Section 14, Chapter III, Part I). 

We start with the mathematical expectations. In the classical
theory $\rho\, d\tau$ is the electric charge inside volume $d\tau$. In the quantum
theory we find the mathematical expectation of a charge by multiplying
the charge of one electron, $-e$, by the mathematical
expectation of the number of electrons. If we have only one
electron, the latter mathematical expectation is equal to the
probability of the electron being in $d\tau$. This, as we know, is simply

$$\bar\psi\psi\, d\tau$$

The quantity $\rho$ then corresponds to

$$\rho \to -e\,\bar\psi\psi \tag{2.1}$$

Consider a small volume $V_0$ in which there are $N(V_0)$ electrons.
The mathematical expectation of $N$ is

$$\text{M. E. } N(V_0) = \int_{V_0} \bar\psi\psi\, d\tau \tag{2.2}$$

We construct the time derivative of (2.2). Substituting for $\partial\psi/\partial t$
and $\partial\bar\psi/\partial t$ their expressions via the wave equation, we get

$$\frac{d}{dt}\int_{V_0} \bar\psi\psi\, d\tau = \frac{i}{\hbar}\int_{V_0} \bigl(\psi\,\overline{H\psi} - \bar\psi\, H\psi\bigr)\, d\tau \tag{2.3}$$
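But according to the Schrödinger equation

$$H\psi = -\frac{\hbar^2}{2m}\nabla^2\psi + U\psi \tag{2.4}$$

If we then use Eq. (2.4) and the fact that

$$\psi\nabla^2\bar\psi - \bar\psi\nabla^2\psi = \operatorname{div}\,(\psi\operatorname{grad}\bar\psi - \bar\psi\operatorname{grad}\psi)$$

we can write formula (2.3) as

$$\frac{d}{dt}\int_{V_0} \bar\psi\psi\, d\tau = -\int_{V_0} \operatorname{div}\mathbf{S}\, d\tau \tag{2.5}$$

where vector $\mathbf{S}$ is defined as

$$\mathbf{S} = \frac{i\hbar}{2m}\,(\psi\operatorname{grad}\bar\psi - \bar\psi\operatorname{grad}\psi) \tag{2.6}$$

Applying to (2.5) Gauss’s integral theorem, we get

$$\frac{d}{dt}\int_{V_0} \bar\psi\psi\, d\tau = -\int_{\sigma_0} S_n\, d\sigma \tag{2.7}$$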

where $n$ is the outer normal to the surface $\sigma_0$ that encompasses $V_0$.
This can be interpreted as meaning that the change in the mathematical
expectation of the number of electrons inside $V_0$ is
equal to the mathematical expectation of the number of electrons
passing through $\sigma_0$. The quantity $S_n\, d\sigma$ will then be the mathematical
expectation of the number of electrons passing through
elementary area $d\sigma$ in unit time. The numbers are added algebraically,
that is, if $d\sigma$ is passed in both directions by equal numbers
of electrons, $S_n = 0$. Vector $\mathbf{S}$ is therefore the electron flux, which
implies that the classical current density has corresponding to it
the vector quantity $-e\mathbf{S}$:

$$\rho\mathbf{v} \to -e\,\mathbf{S} \tag{2.8}$$

Such a definition of $\mathbf{S}$ yields the equation

$$\frac{\partial(\bar\psi\psi)}{\partial t} + \operatorname{div}\mathbf{S} = 0 \tag{2.9}$$

from which it follows that the quantum analogues of $\rho$ and $\rho\mathbf{v}$,
(2.1) and (2.8), satisfy the equation of continuity (1.3).

We can transform expression (2.6) for $\mathbf{S}$ as follows. Put

$$\psi = a\, e^{i\alpha} \tag{2.10}$$

where $a$ and $\alpha$ are real. Then, and this is easy to verify,

$$\mathbf{S} = \frac{\hbar}{m}\, a^2 \operatorname{grad}\alpha \tag{2.11}$$

which implies that $\mathbf{S}$ is parallel to the gradient of the phase of
the wave function.
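
Formulas (2.6) and (2.11) are easily compared on a plane wave $\psi = A e^{ikx}$, for which $a = |A|$ and $\alpha = kx$, so that $S_x = \hbar k |A|^2/m$. The sketch below evaluates (2.6) in one dimension with a central difference for the derivative. The units $\hbar = m = 1$, the step `h`, and the sample wave functions are assumptions made for the illustration.

```python
import cmath
import math

HBAR = 1.0  # units with hbar = m = 1 are assumed here
MASS = 1.0


def flux_x(psi, x: float, h: float = 1e-5) -> float:
    """x-component of S from Eq. (2.6), with d(psi)/dx taken by central difference."""
    value = psi(x)
    dpsi = (psi(x + h) - psi(x - h)) / (2.0 * h)
    s = 1j * HBAR / (2.0 * MASS) * (value * dpsi.conjugate() - value.conjugate() * dpsi)
    return s.real


k, amplitude = 2.0, 0.5


def plane_wave(x: float) -> complex:
    return amplitude * cmath.exp(1j * k * x)


def real_packet(x: float) -> float:
    return math.exp(-x * x)


s_plane = flux_x(plane_wave, 0.3)   # expected: hbar * k * |A|^2 / m, Eq. (2.11)
s_real = flux_x(real_packet, 0.3)   # constant phase, so the flux should vanish
```

A real wave function has a constant phase, and the second call illustrates that in this case there is no flux, in agreement with (2.11).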

On the other hand, we can express the derivatives $\partial\psi/\partial x$,
$\partial\psi/\partial y$, and $\partial\psi/\partial z$ in (2.6) in terms of the results of applying the
operators $p_x$, $p_y$, and $p_z$. We then get

$$S_x = \frac{1}{2m}\bigl(\bar\psi\, p_x\psi + \psi\,\overline{p_x\psi}\bigr) \tag{2.12}$$

or

$$S_x = \frac{1}{2}\bigl(\bar\psi\,\dot x\psi + \psi\,\overline{\dot x\psi}\bigr) \tag{2.13}$$

and similar formulas for the other two components. Here $\dot x$, $\dot y$,
and $\dot z$ are the operators of the respective components of velocity
(see Section 1, Chapter I).

Formula (2.13) and the ones for the other two components are,
from the formal aspect, a natural generalization of the classical
quantities $\rho\dot x$, $\rho\dot y$, $\rho\dot z$ (divided by the electron charge, $-e$), since
charge density $\rho$ corresponds to $-e\bar\psi\psi$ and operators $\dot x$, $\dot y$, $\dot z$ to the
components of velocity.

In Part V we will see that in Dirac's theory of the electron
vector $\mathbf{S}$ can also be represented as (2.13) although the operators
$\dot x$, $\dot y$, $\dot z$ are completely different in form from the ones in
Schrödinger’s theory.

Thus we have found the expressions for the mathematical 
expectations of the number of electrons in a given volume and 
the number of electrons passing through the surface encompassing 
this volume. We now turn to the problem of finding the operators. 

If we have only one electron, the number of electrons (zero or
unity) in volume $V_0$ is a function $f_{V_0}(x, y, z)$ of the position
$(x, y, z)$ of our electron, where $f$ is determined as follows:

$$f_{V_0}(x, y, z) = \begin{cases} 1 & \text{if } (x, y, z) \text{ lies inside } V_0 \\ 0 & \text{if } (x, y, z) \text{ lies outside } V_0 \end{cases} \tag{2.14}$$

For this reason in terms of x, y, z the operator $N(V_0)$ corresponding
to the number of electrons will be multiplication by this
function. The mathematical expectation of this operator will be
exactly (2.2), which is what we expected. If we want to use the
Heisenberg picture for operators, we must build the matrix with
elements

$$(n|N|n') = \int \bar\psi_n(x, t)\, f_{V_0}\, \psi_{n'}(x, t)\, d\tau \tag{2.15}$$

where the $\psi_n(x, t)$ are a complete set of functions satisfying the
Schrödinger equation, for example, the eigenfunctions of the
Hamiltonian. Since $f_{V_0}$ is nonzero only inside $V_0$, where it is
equal to unity, we get

$$(n|N|n') = \int_{V_0} \bar\psi_n \psi_{n'}\, d\tau \tag{2.16}$$

and integration is carried out only over the points inside volume $V_0$.

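After multiplying this matrix element, which is an element of
Heisenberg's matrix for the operator $N(V_0)$, by the electron charge
we come to the analogue of the classical quantity

$$\int_{V_0} \rho\, d\tau \to -e\int_{V_0} \bar\psi_n \psi_{n'}\, d\tau \tag{2.17}$$

The charge density $\rho$ then corresponds to

$$\rho_{nn'} = -e\,\bar\psi_n \psi_{n'} \tag{2.18}$$

This comparison is done in the same way as in Section 8,
Chapter I, where the elements of Heisenberg's matrix for coordinate
x were compared with the classical expressions for the
same quantity.

After we have found the quantum analogue of charge density $\rho$,
we can derive the quantum analogue of the current density $\rho\mathbf{v}$
in the same way that we derived (2.8) from (2.1).

Recalling (2.5) and (2.6), we get

$$\frac{d}{dt}\int_{V_0} \bar\psi_n \psi_{n'}\, d\tau = -\int_{V_0} \operatorname{div}\mathbf{S}_{nn'}\, d\tau \tag{2.19}$$

where $\mathbf{S}_{nn'}$ is the vector

$$\mathbf{S}_{nn'} = \frac{\hbar}{2mi}\left(\bar\psi_n \operatorname{grad}\psi_{n'} - \psi_{n'} \operatorname{grad}\bar\psi_n\right) \tag{2.20}$$

We can then relate $\rho\mathbf{v}$ with

$$(\rho\mathbf{v})_{nn'} = -e\,\mathbf{S}_{nn'} \tag{2.21}$$

Thus (2.12) and (2.13) transform into

$$(S_x)_{nn'} = \frac{1}{2m}\left(\bar\psi_n\, p_x\psi_{n'} + \psi_{n'}\, \overline{p_x\psi_n}\right) \tag{2.22}$$

or

$$(S_x)_{nn'} = \frac{1}{2}\left(\bar\psi_n\, \dot{x}\psi_{n'} + \psi_{n'}\, \overline{\dot{x}\psi_n}\right) \tag{2.23}$$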

We obtained the quantum formulas for the charge and current
densities assuming only one electron, but we can easily generalize
this case for several electrons.


3. Frequencies and intensities

What remains is to substitute the elements of Heisenberg's
matrices into the classical formulas of Section 1. First we note
that if the $\psi_n(x, t)$ are the eigenfunctions of the Hamiltonian,
that is, if

$$\psi_n(x, t) = e^{-iE_n t/\hbar}\,\psi_n^0(x) \tag{3.1}$$

then the dependence of $\rho_{nn'}$ and $(\rho\mathbf{v})_{nn'}$ on time
is purely periodic with an angular frequency

$$\omega_{nn'} = 2\pi\nu_{nn'} = (E_n - E_{n'})/\hbar \tag{3.2}$$


The electromagnetic field (1.11) will obviously have the same
frequency. This yields the Bohr frequency relation, according to which
the frequency of radiation emitted by an atom is equal to the
energy difference $E_n - E_{n'}$ between two stationary states divided
by Planck's constant $h = 2\pi\hbar$. It is convenient then to link the
emission process with the quantum transition (“jump”) of the
atom from one stationary state to another.

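If we substitute (2.18) and (2.21) (the charge and current
densities) into the formula for the potentials, (1.9) or (1.11), we
have²

$$\mathbf{A}_{nn'} = -\frac{e}{c}\int \mathbf{S}_{nn'}\, \frac{e^{-i\omega_{nn'}|\mathbf{r}-\mathbf{r}'|/c}}{|\mathbf{r}-\mathbf{r}'|}\, d\tau', \qquad \varphi_{nn'} = -e\int \bar\psi_n\psi_{n'}\, \frac{e^{-i\omega_{nn'}|\mathbf{r}-\mathbf{r}'|/c}}{|\mathbf{r}-\mathbf{r}'|}\, d\tau' \tag{3.3}$$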


We will now assume that the emitted radiation has a wavelength

$$\lambda_{nn'} = 2\pi c/\omega_{nn'} \tag{3.4}$$

much greater than the dimensions of the atomic system. This
implies that we can disregard the difference in the retardation of
the electromagnetic fields emitted by different parts of the system.³
On this assumption the factor
$|\mathbf{r}-\mathbf{r}'|^{-1} e^{-i\omega_{nn'}|\mathbf{r}-\mathbf{r}'|/c}$
in the integrand will not change considerably in the region where
$\bar\psi_n\psi_{n'}$ and $\mathbf{S}_{nn'}$ do not vanish, and in the
expression for $\mathbf{A}_{nn'}$ we can disregard $\mathbf{r}'$ in
comparison with $\mathbf{r}$ and take the factor outside the integral sign.

² In the formulas e can be either the base of natural logarithms or the
electron charge. But obviously there can be no confusion.

³ This difference can be taken into account in the higher-order approximations.
It is imperative to do so if the electromagnetic field vanishes (see the selection
rules below).

In the expression for $\varphi_{nn'}$ we can replace this
factor by a linear function of position. In summary we get

A„„. -- f 

»„„•- T e ~ S (3.5) 

where according to (2.23)

$$\int \mathbf{S}_{nn'}\, d\tau = \frac{1}{2}\int \left(\bar\psi_n\, \mathbf{v}\psi_{n'} + \psi_{n'}\, \overline{\mathbf{v}\psi_n}\right) d\tau$$

Since the velocity operator is hermitian, both terms in the
integral on the right-hand side are equal, which yields

$$\int \mathbf{S}_{nn'}\, d\tau = \int \bar\psi_n\, \mathbf{v}\psi_{n'}\, d\tau \tag{3.6}$$

By virtue of the equations of motion the matrix element for
velocity is equal to the time derivative of the corresponding
matrix elements for position:

$$\int \bar\psi_n\, \mathbf{v}\psi_{n'}\, d\tau' = \frac{d}{dt}\int \bar\psi_n\, \mathbf{x}\psi_{n'}\, d\tau' = \dot{\mathbf{x}}_{nn'} \tag{3.7}$$

where

$$\mathbf{x}_{nn'} = \int \bar\psi_n\, \mathbf{x}\psi_{n'}\, d\tau' \tag{3.8}$$

On the other hand, due to the orthonormality of the functions
we have

$$\int \bar\psi_n\psi_{n'}\, d\tau = \delta_{nn'} \tag{3.9}$$

We then substitute these expressions into (3.5) and at first put
$n = n'$. Since the diagonal element $\mathbf{x}_{nn}$ does not depend
on time, we find that

we can write



x 0 L e la nn' 
nn* f 


(3.12) 


instead of the first equation in (3.11). 

We now use formulas (1.5) for the electric and magnetic fields. 
We assume that, although the wavelength is long on the atomic 
scale, it nevertheless is short compared to the distance to the 
point where the field is calculated. To simplify matters we will 
consider only the contribution of the vector potential. This is 
justified by the fact that the scalar potential contributes only to 
the numerical value of a constant in the final formulas. We will 
then have 

&. = - K^~e lann ' (i ~ r,c) 

nn c nn r 

Mnn' = &nn' X grad T (3.13)‘ 

These formulas provide an insight into the polarization and
intensity of light of a frequency corresponding to a definite spectral
line. If, for instance, for each pair (n, n') (which actually
means a definite transition) one of the components of
$\mathbf{x}_{nn'}$, say $z_{nn'}$, is nonzero and the two others,
$x_{nn'}$ and $y_{nn'}$, are zero, the light is polarized along the
z axis. But if $x_{nn'} \neq 0$, $y_{nn'} \neq 0$, and $z_{nn'} = 0$,
the light is polarized in the xy plane. For some pairs (n, n') the
matrix elements of all three coordinates, $x_{nn'}$, $y_{nn'}$, and
$z_{nn'}$, might vanish. Then the lines corresponding to
these pairs would be absent. Such transitions are said to be
forbidden. In many cases there are rules that make it possible to
decide whether a line is forbidden or not. These are called selection
rules.

To estimate the radiation intensity we build the time average
of the Poynting vector. For this we take the real parts $\mathcal{E}'$
and $\mathcal{H}'$ of $\mathcal{E}$ and $\mathcal{H}$. The Poynting
vector is then

P = ^ r (ffX»)-^-C-gradr (3.14) 


and the time average of P is

$$\langle \mathbf{P} \rangle = \frac{e^2\omega_{nn'}^4}{8\pi c^3}\left(|x_{nn'}|^2 + |y_{nn'}|^2 + |z_{nn'}|^2\right)\frac{1}{r^2}\operatorname{grad} r \tag{3.15}$$

Hence the intensity of the spectral line with frequency $\omega_{nn'}$
is proportional to

$$I_{nn'} = e^2\omega_{nn'}^4\left(|x_{nn'}|^2 + |y_{nn'}|^2 + |z_{nn'}|^2\right) \tag{3.16}$$

so that the ratio of intensities of two spectral lines is given by the
ratio of the corresponding $I_{nn'}$'s.

If we introduce the electric dipole moment of the electron, D,
in the following way:

$$D_x = -ex, \quad D_y = -ey, \quad D_z = -ez \tag{3.17}$$

then formula (3.16) can be written as

$$I_{nn'} = \omega_{nn'}^4\left[\,|(n|D_x|n')|^2 + |(n|D_y|n')|^2 + |(n|D_z|n')|^2\,\right] \tag{3.18}$$

or simply

$$I_{nn'} = \omega_{nn'}^4\,|(n|\mathbf{D}|n')|^2 \tag{3.18*}$$


In the case when the atom has several electrons, (3.18) can 
still serve as the measure of radiation intensity if D is understood 
to be the total electric moment of all the electrons. The selection 
rule will then state which elements of Heisenberg’s matrix for 
total moment D are nonzero. 

Let us illustrate the above facts by considering the one-dimensional
oscillator. In Chapter I we found that the elements of
Heisenberg's matrix for coordinate x are

$$x_{nn'} = e^{i\omega t}\left(\frac{n\hbar}{2m\omega}\right)^{1/2}\delta_{n-1,n'} + e^{-i\omega t}\left(\frac{(n+1)\hbar}{2m\omega}\right)^{1/2}\delta_{n+1,n'} \tag{3.19}$$

[see (8.8) of Chapter I]. It then follows that the nonzero matrix
elements (and hence intensities) are those whose quantum numbers
n and n' differ by unity:

$$n - n' = \pm 1 \tag{3.20}$$

This is the selection rule for the oscillator. The frequency of
these allowed transitions is equal to the oscillator's fundamental
frequency, and there are no higher harmonics in the radiation. The
measure of intensity for the oscillator is

$$I_{n-1,n} = \frac{\hbar\omega^3 e^2}{2m}\, n \tag{3.21}$$

Note that the intensity here is proportional to the quantum number, n.
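
The selection rule (3.20) and the proportionality in (3.21) can be illustrated with a small truncated matrix. This is a sketch only: the matrix size, the units ($\hbar = m = \omega = e = 1$), and the helper names `x_matrix` and `intensity` are assumptions introduced here for illustration.

```python
import numpy as np

HBAR, MASS, OMEGA, CHARGE = 1.0, 1.0, 1.0, 1.0  # illustrative units (assumption)

def x_matrix(size):
    """Time-independent amplitudes of Heisenberg's matrix for x, cf. (3.19)."""
    x = np.zeros((size, size))
    for n in range(1, size):
        amp = np.sqrt(n * HBAR / (2.0 * MASS * OMEGA))
        x[n, n - 1] = amp  # element (n, n') with n' = n - 1
        x[n - 1, n] = amp  # the hermitian partner, n' = n + 1 for row n - 1
    return x

def intensity(x, n, n_prime):
    """Measure of intensity e^2 w^4 |x_nn'|^2 for the one-dimensional case."""
    w = (n - n_prime) * OMEGA  # omega_nn' = (E_n - E_n') / hbar
    return CHARGE**2 * w**4 * abs(x[n, n_prime]) ** 2

x = x_matrix(6)
# Pairs (n, n') with a nonzero matrix element: only |n - n'| = 1 survives.
allowed = [(n, k) for n in range(6) for k in range(6) if x[n, k] != 0.0]
# I_{n-1,n} relative to I_{0,1}: grows linearly with n, as in (3.21).
ratios = [intensity(x, n - 1, n) / intensity(x, 0, 1) for n in range(1, 6)]
```

Every pair in `allowed` differs by unity, and `ratios` increases linearly with n.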


4. Intensities in a continuous spectrum 

For a continuous spectrum the formulas for intensities must be
modified. Let us assume that an electron is ejected from the
energy level $E_n$ to infinity, where its kinetic energy is $E$ (an
absorption spectrum). In this case we speak not of the intensity of
the monochromatic light of frequency $\omega$, where

$$\omega = (E - E_n)/\hbar \tag{4.1}$$

but of the intensity of light with a frequency range

$$\Delta\omega = \Delta E/\hbar \tag{4.2}$$

This implies that in Heisenberg's matrices we must shift from
the eigenfunction $\psi_{n'}(x)$ to the proper differential corresponding
to the integral

$$\frac{1}{\sqrt{\Delta E}}\,\Delta\Psi = \frac{1}{\sqrt{\Delta E}}\int_E^{E+\Delta E} \psi(x, E)\, dE \tag{4.3}$$

and normalized in a way such that

$$\lim_{\Delta E \to 0} \frac{1}{\Delta E}\int |\Delta\Psi|^2\, d\tau = 1 \tag{4.4}$$

In other respects our results are the same as for the discrete
spectrum.

The above substitution forces us to replace the matrix element

$$x_{nn'} = (E_n|x|E_{n'}) \tag{4.5}$$

with

$$(E_n|x|E)\,(\Delta E)^{1/2} = \frac{1}{\sqrt{\Delta E}}\int \bar\psi_n\, x\, \Delta\Psi\, d\tau \tag{4.6}$$

But $\Delta\Psi$ of (4.6) is approximately

$$\Delta\Psi \approx \psi(x, E)\,\Delta E \tag{4.7}$$

This approximation can only be used if substituting (4.7) for $\Delta\Psi$
does not change the convergence of the integral in $\Delta\Psi$. In our
case it does not, since in (4.6) $\Delta\Psi$ is multiplied by $\bar\psi_n(x)$,
which is a rapidly decreasing function.⁴ For this reason we can write
the matrix element (4.6) as

$$(E_n|x|E) = \int \bar\psi_n(x)\, x\, \psi(x, E)\, d\tau \tag{4.8}$$

Now we use the same method for coordinates y and z and in
(3.15) make the substitution

$$x_{nn'} \to (E_n|x|E)\,(\Delta E)^{1/2} \tag{4.9}$$

We get the intensity of light per interval $\Delta E = \hbar\,\Delta\omega$:

$$I_n(E)\,\Delta E = e^2\omega^4\left[\,|(E_n|x|E)|^2 + |(E_n|y|E)|^2 + |(E_n|z|E)|^2\,\right]\Delta E \tag{4.10}$$

We must use this formula instead of (3.16) if we are dealing
with a continuous spectrum.

⁴ In (4.4), however, we cannot use (4.7).

5. Perturbation of an atom by a light wave 

When a plane monochromatic light wave impinges on an atomic
system, it initiates additional radiation whose frequency can be
equal (a) to that of the incident wave and (b) to the sum or
difference of the incident-wave frequency and the natural frequencies
of the atom (the Raman effect). The additional radiation
interferes with the incident wave, which results in a plane wave
of a different wavelength. The change in wavelength in a medium
can be described by the medium's index of refraction or its dielectric
constant. Both, generally speaking, will depend on the frequency.
Hence there will be dispersion of light.

We will now briefly analyze the Raman effect. Our problem
consists of two steps. First, we must find the approximate solution
to the Schrödinger wave equation with a perturbation in the form
of a light wave. Second, we must find the frequencies and intensities
of the additional radiation and derive the formula for the
dielectric constant.

We start with the perturbed wave equation. If the wavelength
of the incident light is considerably longer than atomic dimensions,
the perturbation Hamiltonian is approximately

$$U = -(D_x\mathcal{E}_x + D_y\mathcal{E}_y + D_z\mathcal{E}_z) = -(\mathbf{D}\cdot\boldsymbol{\mathcal{E}}) \tag{5.1}$$

where $\boldsymbol{\mathcal{E}}$ is the electric field vector, and D is
the vector of the electric dipole moment of the atom. When there is one
valence electron,

$$D_x = -ex, \quad D_y = -ey, \quad D_z = -ez \tag{5.2}$$

The wave is considered monochromatic, so that

$$\boldsymbol{\mathcal{E}} = \tfrac{1}{2}\boldsymbol{\mathcal{E}}^0\left(e^{i\omega t} + e^{-i\omega t}\right) \tag{5.3}$$

The time dependence of the perturbation Hamiltonian will then be

$$U = U_0 e^{i\omega t} + U_0^\dagger e^{-i\omega t} \tag{5.4}$$

where $U_0$ and $U_0^\dagger$ do not depend on time. The notation
$U_0^\dagger$ is justified by the fact that $U_0^\dagger$ is the hermitian
conjugate of $U_0$, in the sense of the linear operator theory.

If $H^0$ is the zeroth-order Hamiltonian, the wave equation whose
approximate solution we are trying to find will have the following
form:

$$H^0\psi + \left(U_0 e^{i\omega t} + U_0^\dagger e^{-i\omega t}\right)\psi - i\hbar\frac{\partial\psi}{\partial t} = 0 \tag{5.5}$$

We consider the external electric field to be weak, and in looking
for $\psi$ we limit ourselves to the first-order approximation. This
implies that we must take $\psi$ in the form

$$\psi = \psi^* + v + w \tag{5.6}$$

where v and w are small. If we put (5.6) into (5.5) and neglect
terms of the order of Uv and Uw, we get

$$H^0\psi^* + H^0 v + H^0 w + U_0 e^{i\omega t}\psi^* + U_0^\dagger e^{-i\omega t}\psi^* = i\hbar\frac{\partial\psi^*}{\partial t} + i\hbar\frac{\partial v}{\partial t} + i\hbar\frac{\partial w}{\partial t}$$

To satisfy this equation we will adjust the functions $\psi^*$, v, and
w in a way such that

$$H^0\psi^* - i\hbar\frac{\partial\psi^*}{\partial t} = 0 \tag{5.7}$$

$$H^0 v - i\hbar\frac{\partial v}{\partial t} = -U_0 e^{i\omega t}\psi^* \tag{5.8}$$

$$H^0 w - i\hbar\frac{\partial w}{\partial t} = -U_0^\dagger e^{-i\omega t}\psi^* \tag{5.8*}$$

Equation (5.7) is the unperturbed wave equation. Its solution is

$$\psi^* = \psi_n^0\, e^{-iE_n t/\hbar} \tag{5.9}$$

If we then substitute (5.9) into Eqs. (5.8) and (5.8*), we see that
these equations will be satisfied if

$$v = v_n^0\, e^{i(\omega - E_n/\hbar)t} \tag{5.10}$$

$$w = w_n^0\, e^{-i(\omega + E_n/\hbar)t} \tag{5.10*}$$

where $v_n^0$ and $w_n^0$ do not depend on time. For the last two
functions we have the following equations:

$$H^0 v_n^0 - (E_n - \hbar\omega)\, v_n^0 = -U_0\psi_n^0 \tag{5.11}$$

$$H^0 w_n^0 - (E_n + \hbar\omega)\, w_n^0 = -U_0^\dagger\psi_n^0 \tag{5.11*}$$

When solving these equations, we must assume that $E_n \pm \hbar\omega$
do not coincide with any of the eigenvalues of $H^0$, that is

$$E_n - E_{n'} \pm \hbar\omega \neq 0 \tag{5.12}$$

or

$$|\omega_{nn'}| \neq |\omega| \tag{5.13}$$

where

$$\omega_{nn'} = (E_n - E_{n'})/\hbar \tag{5.14}$$

Condition (5.13) implies that there is no resonance between
natural frequency $\omega_{nn'}$ of the atom and frequency $\omega$ of
the light wave. If there is resonance, our method of solving Eq. (5.5),
based on the assumption that $v_n^0$ and $w_n^0$ are small, does not work.

When there is no resonance, that is, when condition (5.13)
holds, Eqs. (5.11) and (5.11*) can be solved by the method of
Section 2, Chapter II. If to simplify matters we assume that there
are no degenerate eigenvalues and no continuous spectrum, we
get

$$v_n^0 = -\sum_m \frac{(m|U_0|n)}{E_m - E_n + \hbar\omega}\,\psi_m^0 \tag{5.15}$$

$$w_n^0 = -\sum_m \frac{(m|U_0^\dagger|n)}{E_m - E_n - \hbar\omega}\,\psi_m^0 \tag{5.15*}$$

where

$$(m|U_0|n) = \int \bar\psi_m^0\, U_0\, \psi_n^0\, d\tau \tag{5.16}$$

Since we assumed that the wavelength of the incident light is
much longer than the atomic dimensions, the amplitude of
$\boldsymbol{\mathcal{E}}$ can be considered constant within the limits
of integration in (5.16). Formulas (5.1), (5.3), and (5.4) then yield

$$(m|U_0|n) = -\tfrac{1}{2}\left[\mathcal{E}_x^0\,(m|D_x^0|n) + \mathcal{E}_y^0\,(m|D_y^0|n) + \mathcal{E}_z^0\,(m|D_z^0|n)\right] = -\tfrac{1}{2}\,(m|\boldsymbol{\mathcal{E}}^0\cdot\mathbf{D}^0|n) \tag{5.17}$$

where

$$(m|\mathbf{D}^0|n) = \int \bar\psi_m^0\, \mathbf{D}\, \psi_n^0\, d\tau \tag{5.18}$$

Finally, using (5.6), (5.9), and (5.10), we can write the approximate
solution to Eq. (5.5):

$$\psi_n = e^{-iE_n t/\hbar}\left(\psi_n^0 + v_n^0\, e^{i\omega t} + w_n^0\, e^{-i\omega t}\right) \tag{5.19}$$

where $v_n^0$ and $w_n^0$ are found by (5.15) and (5.15*).


6. The dispersion formula

In the classical electron theory we characterize the radiation of
an atom by its electric dipole moment. The quantum analogue of
this moment is, as we saw in Section 3, an element of Heisenberg's
matrix corresponding to the product of the electron charge
by its coordinate, or the sum of such products if there are several
electrons.

The incident wave induces an additional electric dipole moment,
which for an optically homogeneous medium is proportional to
the electric field. In the general case this moment will be a linear
vector function of the electric field components. It is the frequency
dependence of the proportionality factor (or the factors of the
vector function) that serves as the basis for explaining dispersion.

If we know the approximate solution of the wave equation
when the perturbation is in the form of a light wave, we can
build the elements of Heisenberg's matrix for the electric dipole
moment. Neglecting the squares and products of $v_n^0$ and $w_n^0$,
we get

$$\int \bar\psi_n\, \mathbf{D}\, \psi_{n'}\, d\tau = e^{i\omega_{nn'}t}\,(n|\mathbf{D}^0|n') + \tfrac{1}{2}\mathbf{C}_{nn'}\, e^{i(\omega_{nn'}+\omega)t} + \tfrac{1}{2}\mathbf{C}_{nn'}^\dagger\, e^{i(\omega_{nn'}-\omega)t} \tag{6.1}$$

where

$$\mathbf{C}_{nn'} = 2\int \bar\psi_n^0\, \mathbf{D}\, v_{n'}^0\, d\tau + 2\int \bar w_n^0\, \mathbf{D}\, \psi_{n'}^0\, d\tau \tag{6.2}$$

$$\mathbf{C}_{nn'}^\dagger = 2\int \bar v_n^0\, \mathbf{D}\, \psi_{n'}^0\, d\tau + 2\int \bar\psi_n^0\, \mathbf{D}\, w_{n'}^0\, d\tau \tag{6.2*}$$

so that

$$\mathbf{C}_{nn'}^\dagger = \overline{\mathbf{C}}_{n'n} \tag{6.3}$$

which justifies the “dagger” in (6.2*). If in (6.2) we now substitute
(5.15) and (5.15*) for $v_n^0$ and $w_n^0$ and use (5.17), we get

$$\mathbf{C}_{nn'} = \sum_m \frac{(n|\boldsymbol{\mathcal{E}}^0\cdot\mathbf{D}^0|m)\,(m|\mathbf{D}^0|n')}{\hbar(\omega_{mn} - \omega)} + \sum_m \frac{(n|\mathbf{D}^0|m)\,(m|\boldsymbol{\mathcal{E}}^0\cdot\mathbf{D}^0|n')}{\hbar(\omega_{mn'} + \omega)} \tag{6.4}$$


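We see that in addition to the natural frequency of the atom,
formula (6.1) contains the sum and difference $\omega_{nn'} \pm \omega$
(the Raman effect). The terms

$$\tfrac{1}{2}\mathbf{C}_{nn'}\, e^{i(\omega_{nn'}+\omega)t} + \tfrac{1}{2}\mathbf{C}_{nn'}^\dagger\, e^{i(\omega_{nn'}-\omega)t} \tag{6.5}$$

in (6.1) represent the additional dipole moment induced by the
perturbation, the light wave. The diagonal element of this moment,

$$\mathbf{D}'_{nn} = \tfrac{1}{2}\mathbf{C}_{nn}\, e^{i\omega t} + \tfrac{1}{2}\mathbf{C}_{nn}^\dagger\, e^{-i\omega t} \tag{6.6}$$

has the frequency of the incident wave. For this reason the
diagonal element is the immediate counterpart of the classical
additional moment.

Let us now turn to the dependence of D' on the electric field
components. At $n = n'$ the expression (6.4) for $\mathbf{C}_{nn}$ yields

$$(D'_x)_n = \operatorname{Re}\left[(\alpha_{xx})_n\,\mathcal{E}_x^0 e^{i\omega t} + (\alpha_{xy})_n\,\mathcal{E}_y^0 e^{i\omega t} + (\alpha_{xz})_n\,\mathcal{E}_z^0 e^{i\omega t}\right]$$

$$(D'_y)_n = \operatorname{Re}\left[(\alpha_{yx})_n\,\mathcal{E}_x^0 e^{i\omega t} + (\alpha_{yy})_n\,\mathcal{E}_y^0 e^{i\omega t} + (\alpha_{yz})_n\,\mathcal{E}_z^0 e^{i\omega t}\right] \tag{6.7}$$

$$(D'_z)_n = \operatorname{Re}\left[(\alpha_{zx})_n\,\mathcal{E}_x^0 e^{i\omega t} + (\alpha_{zy})_n\,\mathcal{E}_y^0 e^{i\omega t} + (\alpha_{zz})_n\,\mathcal{E}_z^0 e^{i\omega t}\right]$$

where, for instance,

$$(\alpha_{xy})_n = \sum_m \left(\frac{(n|D_y^0|m)\,(m|D_x^0|n)}{\hbar(\omega_{mn} - \omega)} + \frac{(n|D_x^0|m)\,(m|D_y^0|n)}{\hbar(\omega_{mn} + \omega)}\right) \tag{6.8}$$

The other coefficients are expressed by similar formulas, in which
x and y are changed to the appropriate labels.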
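
As a worked instance of a sum of the type (6.8), one can take the one-dimensional oscillator of Section 3, for which only the neighbouring levels contribute. The sketch below is illustrative only: it assumes units with $\hbar = m = \omega_0 = e = 1$, takes $D_x = -ex$ with $|x_{nm}|^2$ from (3.19), and compares the result with the well-known classical polarizability of an elastically bound electron, $e^2/[m(\omega_0^2 - \omega^2)]$, which is not derived in the text. The function names are hypothetical.

```python
import numpy as np

HBAR, MASS, OMEGA0, CHARGE = 1.0, 1.0, 1.0, 1.0  # illustrative units (assumption)

def alpha_xx(n, omega, size=40):
    """Sum of the type (6.8) with x = y for oscillator state n, D_x = -e x."""
    total = 0.0
    for m in range(size):
        if abs(m - n) != 1:
            continue  # selection rule (3.20): all other elements vanish
        x_nm_sq = max(n, m) * HBAR / (2.0 * MASS * OMEGA0)  # |x_nm|^2 from (3.19)
        d_sq = CHARGE**2 * x_nm_sq
        w_mn = (m - n) * OMEGA0  # (E_m - E_n) / hbar
        total += d_sq / (HBAR * (w_mn - omega)) + d_sq / (HBAR * (w_mn + omega))
    return total

def alpha_classical(omega):
    """Classical polarizability of an elastically bound electron."""
    return CHARGE**2 / (MASS * (OMEGA0**2 - omega**2))

omega = 0.3  # incident frequency away from resonance, cf. condition (5.13)
values = [alpha_xx(n, omega) for n in range(4)]
```

Under these assumptions every entry of `values` coincides with `alpha_classical(omega)`, independently of n.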

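The array

$$\begin{pmatrix} \alpha_{xx} & \alpha_{xy} & \alpha_{xz} \\ \alpha_{yx} & \alpha_{yy} & \alpha_{yz} \\ \alpha_{zx} & \alpha_{zy} & \alpha_{zz} \end{pmatrix} \tag{6.9}$$

(where we omitted label n) is a hermitian matrix, so that, for
instance,

$$\alpha_{yx} = \bar\alpha_{xy} \tag{6.10}$$

Thus the additional dipole moment D' is a linear vector function
of the electric field, and for complex coefficients in (6.9) the phase
of D' does not coincide with the phase of $\boldsymbol{\mathcal{E}}$.
But if the $\alpha$'s are real (which is the case when the functions
$\psi_n^0$ are real, for instance), the phases of D' and
$\boldsymbol{\mathcal{E}}$ coincide and Eqs. (6.7) read

$$(D'_x)_n = (\alpha_{xx})_n\,\mathcal{E}_x + (\alpha_{xy})_n\,\mathcal{E}_y + (\alpha_{xz})_n\,\mathcal{E}_z$$

$$(D'_y)_n = (\alpha_{yx})_n\,\mathcal{E}_x + (\alpha_{yy})_n\,\mathcal{E}_y + (\alpha_{yz})_n\,\mathcal{E}_z \tag{6.11}$$

$$(D'_z)_n = (\alpha_{zx})_n\,\mathcal{E}_x + (\alpha_{zy})_n\,\mathcal{E}_y + (\alpha_{zz})_n\,\mathcal{E}_z$$

where $\boldsymbol{\mathcal{E}}$ is the real electric field vector. A
particularly interesting case is when

$$\alpha_{xx} = \alpha_{yy} = \alpha_{zz} = \alpha_n, \qquad \alpha_{yz} = \alpha_{zx} = \alpha_{xy} = 0 \tag{6.12}$$

and the dependence of D' on $\boldsymbol{\mathcal{E}}$ simply reduces to
proportionality:

$$\mathbf{D}'_n = \alpha_n\,\boldsymbol{\mathcal{E}} \tag{6.13}$$

Up to now we have dealt with the additional dipole moment for
one particle. To find the total electric dipole moment per unit
volume, D*, we denote by $N_r$ the number of particles in state r
per unit volume and construct the sum

$$\mathbf{D}^* = \sum_r N_r\,\mathbf{D}'_r \tag{6.14}$$

which is

$$\mathbf{D}^* = \alpha\,\boldsymbol{\mathcal{E}} \tag{6.15}$$

where

$$\alpha = \sum_r N_r\,\alpha_r \tag{6.16}$$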





According to the classical electron theory the dielectric constant
$\varepsilon$ and the proportionality factor $\alpha$ in (6.15) are
related in the following way:

$$\frac{3}{4\pi}\,\frac{\varepsilon - 1}{\varepsilon + 2} = \alpha \tag{6.17}$$

Hence our formulas connect the dielectric constant with atomic
quantities.

We note that $N_r$ of (6.16) depends on temperature. Classical
Boltzmann statistics gives this dependence as

$$N_r = N\,\frac{\exp(-E_r/kT)}{\sum_r \exp(-E_r/kT)} \tag{6.18}$$

where N is the total number of atoms per unit volume, and $E_r$
is the energy of one atom in state r. As to $\alpha_r$, it depends only on
the properties of the particles and on the frequency of the incident
wave, $\omega$. This frequency dependence of $\alpha$ explains dispersion.
We have just seen that $\omega$ enters the expression for $\alpha$ through
the denominators $\omega_{mn} \pm \omega$, characteristic of the dispersion
formula.
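
The chain (6.18), (6.16), (6.17) can be traced numerically. The following is a minimal sketch, not a realistic calculation: the level energies, the per-state coefficients $\alpha_r$, the density, and the units ($k = 1$) are hypothetical values chosen only to exercise the formulas.

```python
import numpy as np

def populations(energies, temperature, total_density, k_boltzmann=1.0):
    """N_r from (6.18); energies and kT must share the same units."""
    energies = np.asarray(energies, dtype=float)
    # Subtracting the minimum energy avoids overflow and leaves ratios unchanged.
    weights = np.exp(-(energies - energies.min()) / (k_boltzmann * temperature))
    return total_density * weights / weights.sum()

def dielectric_constant(alpha):
    """Solve (6.17), (3 / 4 pi) (eps - 1) / (eps + 2) = alpha, for eps."""
    q = 4.0 * np.pi * alpha / 3.0
    return (1.0 + 2.0 * q) / (1.0 - q)

levels = [0.0, 1.0, 2.0]                  # hypothetical level energies E_r
alphas_r = np.array([1e-3, 2e-3, 3e-3])   # hypothetical alpha_r at a fixed omega
n_r = populations(levels, temperature=1.0, total_density=10.0)
alpha_total = float(np.sum(n_r * alphas_r))  # Eq. (6.16)
eps = dielectric_constant(alpha_total)
```

Since $\alpha_r$ depends on $\omega$ and $N_r$ on temperature, repeating the last three lines for other frequencies or temperatures shows how both enter the dielectric constant.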

The theory presented in this section is no more than an outline, 
giving a general idea of the effect of light on an atom. It is far from
complete for several reasons. First, we have only briefly mentioned 
the relation between quantities referring to one atom or molecule 
(the dipole moment, for instance) and macroscopic quantities (for 
instance, the dielectric constant), and for the distribution by 
state we have confined ourselves to the classical formula (6.18). 
Second, even quantities referring to one atom were described 
semiclassically since we did not introduce the concept of light 
quanta, and schematically since we did not touch on the problem 
of degenerate eigenvalues, for instance, and did not clarify in 
what conditions (6.12) holds. Last, we said nothing about what 
happens if there is resonance and did not discuss the question of 
the width of a spectral line. 

As we said at the beginning of this chapter, a full exposition of 
the theory of radiation goes beyond the subject of this book. 


7. Penetration of a potential barrier by a particle 

In this section we will consider a problem that in classical terms
could be called the penetration of a potential barrier by a particle.
In essence this problem is nonstationary, but we will use wave
functions that are the eigenfunctions of the Hamiltonian as auxiliary
quantities.

Let us assume that the potential U depends only on the distance
from an attractive centre, that is, the problem is spherically
symmetric. We will also deal only with states in which the wave
functions depend (apart from time) only on r:

$$\frac{\partial\psi}{\partial\vartheta} = \frac{\partial\psi}{\partial\varphi} = 0 \tag{7.1}$$

(The general case is considered in Chapter IV.) If we put

$$\psi = \frac{f(r, t)}{r} \tag{7.2}$$

then the Schrödinger equation

$$-\frac{\hbar^2}{2m}\nabla^2\psi + U\psi = i\hbar\frac{\partial\psi}{\partial t} \tag{7.3}$$

transforms into

$$-\frac{\hbar^2}{2m}\frac{\partial^2 f}{\partial r^2} + Uf = i\hbar\frac{\partial f}{\partial t} \tag{7.4}$$

For a state with definite energy (a stationary state) we can
assume that

$$f(r, t) = f(r)\, e^{-iEt/\hbar} \tag{7.5}$$

Then Eq. (7.4) reads

$$-\frac{\hbar^2}{2m}\frac{d^2 f}{dr^2} + [U(r) - E]\, f = 0 \tag{7.6}$$

For the function $\psi$ to remain finite everywhere, we must assume
the following boundary condition:

$$f(0) = 0 \tag{7.7}$$

Now we make certain assumptions concerning the behaviour of
U(r). Let U(r) first increase monotonically, starting from some
definite value (or even from $-\infty$, which is the case for an
electron in the Coulomb field of a nucleus, $U = -e^2/r$). In the
process U(r) attains a maximum value, after which it decreases
without restriction. We will consider the values of E that are less
than the maximum of U(r). The difference $U(r) - E$ vanishes at
two points, before and after the maximum, say at $r = r_1$ and
$r = r_2$. Hence for r there are three distinct regions

$$\text{I. } 0 < r < r_1, \qquad \text{II. } r_1 < r < r_2, \qquad \text{III. } r_2 < r \tag{7.8}$$

How does f behave in each region? In the first, where $U < E$,
function f will oscillate. If we put

$$S_1(r) = \int_0^r [2m(E - U)]^{1/2}\, dr \tag{7.9}$$

then in the semiclassical approximation (Section 15, Chapter III,
Part I) and for r's that are smaller than $r_1$ (but not too close to
$r_1$) we have

$$f = c\left(\frac{dS_1}{dr}\right)^{-1/2}\cos(S_1/\hbar + \alpha) \tag{7.10}$$

where c and $\alpha$ are constants. In the second region we put

$$S_2(r) = \int_{r_1}^r [2m(U - E)]^{1/2}\, dr \tag{7.11}$$

Then approximately

$$f = \left(\frac{dS_2}{dr}\right)^{-1/2}\left(c_1 e^{S_2/\hbar} + c_2 e^{-S_2/\hbar}\right) \tag{7.12}$$

In the third region

$$f = c'\left(\frac{dS_3}{dr}\right)^{-1/2}\cos(S_3/\hbar + \alpha') \tag{7.13}$$

where

$$S_3 = \int_{r_2}^r [2m(E - U)]^{1/2}\, dr \tag{7.14}$$

and c' and $\alpha'$ are new constants. In view of (7.2), function $\psi$
differs from f by a factor, 1/r, that tends to zero as $r \to \infty$. For
this reason the behaviour of $\psi$ in the first and third regions will be
different: in the third region $\psi$ will tend to zero as $r \to \infty$.

In the transition region, where the difference $U(r) - E$ in the
Schrödinger equation (7.6) changes sign (that is, near the “turning”
points $r = r_1$ and $r = r_2$), the above formulas for f and S
do not hold; there the approximate solutions can be expressed
using the Airy functions, which are the solutions of Airy's differential
equation

$$w''(t) = t\,w(t) \tag{7.15}$$

in which the coefficient of the unknown function passes through
zero. We will not discuss the use of the Airy functions as applied
to our problem.

The most interesting case is that of a very high barrier. Such a
barrier is characterized by

$$\frac{1}{\hbar}S = \frac{1}{\hbar}S_2(r_2) \tag{7.16}$$

being considerably greater than unity. The behaviour of f (and
of $\psi$) will depend essentially on the coefficients in (7.12). If
$c_1 = 0$, the first term in (7.12) vanishes and as r increases from
$r_1$ to $r_2$ the function f rapidly decreases, so that the amplitude
of $\psi$ will be considerably smaller in region III than in region I.
This means that the probability (more correctly, the probability
density) of finding the particle inside the barrier is much greater
than of finding it outside.
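
The size of the exponent in (7.16) is easy to estimate numerically for a given barrier. The sketch below is an illustration under assumptions: it uses a hypothetical parabolic barrier top, units with $m = 1$, and a plain trapezoidal rule; for this particular shape the integral (7.11) taken up to $r_2$ has the closed form $\pi (U_{\max} - E)\sqrt{m/k}$, which serves as a check.

```python
import numpy as np

MASS = 1.0  # illustrative units (assumption)

def barrier_exponent(potential, energy, r1, r2, points=200001):
    """S_2(r_2): integral from r1 to r2 of [2m(U - E)]^(1/2) dr, cf. (7.11)."""
    r = np.linspace(r1, r2, points)
    # Clip tiny negative rounding errors at the turning points before the root.
    integrand = np.sqrt(np.clip(2.0 * MASS * (potential(r) - energy), 0.0, None))
    dr = r[1] - r[0]
    # Trapezoidal rule written out explicitly.
    return dr * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

U_MAX, STIFFNESS, R_TOP = 5.0, 2.0, 3.0  # hypothetical barrier parameters

def parabolic_barrier(r):
    """U(r) = U_max - k (r - r_top)^2 / 2 near the top of the barrier."""
    return U_MAX - 0.5 * STIFFNESS * (r - R_TOP) ** 2

energy = 1.0
half_width = np.sqrt(2.0 * (U_MAX - energy) / STIFFNESS)  # turning points r_top -/+ this
s_numeric = barrier_exponent(parabolic_barrier, energy,
                             R_TOP - half_width, R_TOP + half_width)
s_exact = np.pi * (U_MAX - energy) * np.sqrt(MASS / STIFFNESS)
```

With $c_1 = 0$ in (7.12), the function f falls by roughly a factor $e^{-S/\hbar}$ across region II, so the larger `s_numeric` is in units of $\hbar$, the more strongly the particle is confined.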

For arbitrary values of E we cannot make $c_1$ vanish. The condition
$c_1(E) = 0$ selects certain values of E. (This was the case
with the standard problem of Schrödinger's theory, where the
eigenvalues of the Hamiltonian were selected.) Let $E = E_0$ be
one of such values. The corresponding eigenfunction will be
$\psi_{E_0}(r)$.

Since this function is nonzero outside the barrier, we cannot
consider it the correct function describing the particle inside the
barrier. To describe such a state we introduce a function $\psi_0(r)$
that is close to $\psi_{E_0}(r)$, namely

$$\psi_0(r) = \begin{cases} \psi_{E_0}(r) & \text{at } 0 < r < r_2 \\ 0 & \text{at } r_2 < r \end{cases} \tag{7.17}$$

with normalization

$$\int_0^\infty |\psi_0(r)|^2 r^2\, dr = \int_0^{r_2} |\psi_{E_0}(r)|^2 r^2\, dr = 1 \tag{7.18}$$

In the state described by $\psi_0(r)$ the particle is inside the barrier.
This state, however, is nonstationary since $\psi_0$ does not satisfy the
continuity condition at $r = r_2$, which is a necessary condition for
any energy eigenfunction. Nevertheless, the state will in a sense
be quasi-stationary. This means that if at time $t = 0$ it was described
by $\psi_0(r)$, for which the probability of finding the particle
inside the barrier is unity, in subsequent moments the probability
will be a slowly decreasing function of time. The law of such
slowly decreasing probability can be called the law of decay of a
system in a quasi-stationary state.

In classical mechanics the decay of a system would be associated
with passage of particles over the potential barrier. If a
particle could always be localized in space, we would have to
say that in the region above the barrier the particle's kinetic
energy takes on negative values, which is impossible. On the
other hand, such phenomena as the escape of $\alpha$-particles from an
atom's nucleus or the ionization of atoms in an external electric
field prove that decay of systems of such types is indeed possible.
If we now compare the two conclusions, we automatically deduce
that we cannot apply classical concepts to the above phenomena
and that the interpretation of such phenomena requires new
(quantum) concepts. But according to quantum mechanics we
cannot say that a particle is above a barrier until we show the
way this can be determined. In turn, to find the particle above the
barrier we must impart the lacking energy to it. This is the way
the above paradox is removed.

8. The law of decay of a quasi-stationary state 

We can formulate the law of decay of a quasi-stationary state
of a system for a fairly general case if we introduce the energy
distribution function in this state.

Denote by x the set of all coordinates (or other variables, in
terms of which the wave function is expressed). Let
$\psi_0 = \psi(x, 0)$ be the initial value of the system's wave function,
$\psi(x, t)$. We express $\psi_0$ as an integral of the eigenfunctions of
the Hamiltonian, $\psi_E(x)$:

$$\psi(x, 0) = \int c(E)\,\psi_E(x)\, dE \tag{8.1}$$

Then at $t > 0$ the state of the system will be

$$\psi(x, t) = \int e^{-iEt/\hbar}\, c(E)\,\psi_E(x)\, dE \tag{8.2}$$

The probability L(t) that at time t the system can be found in
its initial state is the squared modulus of the scalar product

$$p(t) = \int \bar\psi(x, 0)\,\psi(x, t)\, dx \tag{8.3}$$

so that we have

$$L(t) = |p(t)|^2 \tag{8.4}$$

Using the completeness condition for the functions $\psi_E(x)$, we
find that p(t) can also be expressed as the scalar product of the
coefficients of (8.1) and (8.2). We have

$$p(t) = \int e^{-iEt/\hbar}\, \bar c(E)\, c(E)\, dE \tag{8.5}$$

But

$$dW(E) = w(E)\, dE = |c(E)|^2\, dE \tag{8.6}$$

is the energy distribution function for the initial state (which
means that it is the distribution function at any time $t > 0$).
Hence (8.5) reads

$$p(t) = \int e^{-iEt/\hbar}\, w(E)\, dE = \int e^{-iEt/\hbar}\, dW(E) \tag{8.7}$$

Thus the probability that the system has not yet decayed at
time t is

$$L(t) = |p(t)|^2 = \left|\int e^{-iEt/\hbar}\, dW(E)\right|^2 \tag{8.8}$$

which implies that

The law of decay of a state depends only on the energy
distribution function.


With the appropriate choice of integral distribution function W(E),
Eq. (8.8) is also valid for a discontinuous W(E) (discrete spectrum).

We note that the law of decay can be the same for two different
states provided their energy distribution functions are the same.
We must also note that time t in the expression for the probability
of decay is counted off starting from the (latest) moment when
we can still say that the atom (or system) has not yet decayed;
the state of the undecayed atom, $\psi_0$, does not change. If we wish,
we can say that the atom does not age but decays suddenly. This
corollary is valid for any law of decay, not only the exponential
law.

Interestingly, in formula (8.7) it is not the probability amplitude 
(wave function) that is subject to a Fourier transformation, which 
would be the case in quantum mechanics, but the probability 
itself. According to the nomenclature of probability theory p(t) is 
the characteristic function for E. 

The properties of the Fourier integral yield the relation between
the rate of decay and the smoothness of the distribution
function. Let us first investigate the conditions in which decay is 
possible at all. 

If there is a differential energy-distribution function (probability
density) w(E), the integral distribution function W(E) is
related to it as

$$W(E') - W(E) = \int_E^{E'} w(E)\, dE \tag{8.9}$$

where E' and E are any two values of energy. If $E' > E$, the above
expression is obviously the probability that the energy of the
system lies between E and E'. If there is no discrete spectrum,
W(E) is continuous for any initial state. But if there is a discrete
spectrum, W(E) will be continuous only if in the initial state all
probabilities concerning the discrete spectrum vanish.

Assume that W(E) is continuous. Then (8.9) implies that the
continuity of W(E) is equivalent to the absolute integrability of
w(E) (in the usual meaning). But if w(E) is absolutely integrable,
the value of (8.5) tends to zero as $t \to \infty$.

Thus the continuity of W(E) yields

$$L(t) \to 0 \quad \text{as } t \to \infty \tag{8.10}$$

where according to (8.7) and (8.8), L(t) is the probability that
at time t the system has not yet decayed. On the other hand we
can show that the continuity of W(E) follows from (8.10). We
come to the conclusion that

A system decays if and only if the integral energy-distribution
function is continuous.

In many problems the energy distribution function satisfies much
stricter conditions than the simple continuity of $W(E)$. For instance,
when a particle in a potential well penetrates a barrier, the
probability density $w(E)$ will be a meromorphic function⁵
of a complex variable (this problem was considered in Section 7).
Since at real values of $E$ the function $w(E)$ is real, its poles lie
symmetrically with respect to the real axis, and the residues at
each symmetric pair of poles will be complex conjugates (the
physical meaning of $w(E)$ as probability density implies that it
cannot have any poles on the real axis). Let the pair of poles
closest to the real axis be

$$ E = E_0 \pm i\Gamma, \qquad \Gamma > 0 \tag{8.11} $$


Let the next pair of poles have imaginary parts $\pm i\Gamma'$. It is easy
to see that if $t$ is so great that

$$ \frac{(\Gamma' - \Gamma)\,t}{\hbar} \gg 1 \tag{8.12} $$


then the value of integral (8.7) will be determined by the residues
at the two poles (8.11), whereas none of the other poles will
contribute. So we can assume that $w(E)$ has only two poles,
which means that we can put⁶

$$ w(E) = \frac{1}{\pi}\,\frac{\Gamma}{(E - E_0)^2 + \Gamma^2} \tag{8.13} $$

This is the dispersion formula for the energy distribution.
If we now substitute (8.13) into (8.7), we get

$$ p(t) = e^{-iE_0 t/\hbar - \Gamma|t|/\hbar} \tag{8.14} $$

and hence

$$ L(t) = e^{-2\Gamma|t|/\hbar} \tag{8.15} $$


Thus to obtain the usual exponential form of the decay law it 
is sufficient to assume that w(E) is meromorphic — an assumption 
that can be proved by analyzing the Schrodinger equation of the 
problem. 
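The step from (8.13) to (8.15) can be checked numerically. The sketch below is only an illustration under stated assumptions: energies are measured from $E_0$, the units have $\hbar = 1$ and $\Gamma = 1$, the integral (8.7) is truncated to a finite interval and evaluated by the trapezoidal rule, and the function names are ours.

```python
import math

def survival_amplitude(t, gamma=1.0, hbar=1.0, half_width=400.0, step=0.005):
    """Trapezoidal estimate of |p(t)| for the dispersion formula (8.13).

    Energies are measured from E_0, so the phase exp(-i E_0 t / hbar) is
    dropped. The sine part of the integrand cancels by symmetry, which
    leaves twice the cosine integral over x = E - E_0 >= 0.
    """
    n = int(half_width / step)
    total = 0.0
    for k in range(n + 1):
        x = k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.cos(x * t / hbar) / (x * x + gamma * gamma)
    return 2.0 * step * total * gamma / math.pi

def decay_law(t, gamma=1.0, hbar=1.0):
    """L(t) = |p(t)|^2, to be compared with exp(-2 gamma |t| / hbar)."""
    return survival_amplitude(t, gamma, hbar) ** 2

for t in (1.0, 2.0):
    print(t, decay_law(t), math.exp(-2.0 * t))
```

The two printed columns should agree to several decimal places; the residual difference comes from truncating the slowly decaying tails of $w(E)$.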


⁵ A function is meromorphic throughout a region $D$ if and only if its only
singularities throughout $D$ are poles.

⁶ We assume that $\Gamma \ll E_0 - E^*$, where $E^*$ is the lower limit of integration
in (8.7); usually $E^* = 0$.




Chapter IV 


AN ELECTRON IN A CENTRAL FIELD 


1. General remarks 

The problem of describing the states of an electron in a central
field is of great practical importance, since its solution gives us
not only the theory of the spectrum of hydrogen (an electron moving
in the Coulomb field) but the approximate theory of the
spectra of atoms with one valence electron (the sodium atom, for
instance).

In the hydrogen atom the electron is in the electrostatic Coulomb
field of the nucleus, so that the potential energy (or simply
potential) is

$$ U(r) = -\frac{e^2}{r} $$


In atoms with several electrons the electrons lose their individuality
in a way, and, generally speaking, we cannot consider the
states of separate electrons and describe them by using wave
functions $\psi$ that depend on the coordinates of each electron
respectively. Instead we must examine the state of the atom as a whole
and describe it by a wave function that depends on the coordinates
of all the electrons. We must also take into account the intrinsic
degree of freedom (the so-called spin) and the symmetry of the
wave function with respect to the interchange of any two electrons
(exchange symmetry). The many-electron problem is dealt with
in Part IV. We will only remark that in a certain approximation
the wave function of the atom as a whole can be expressed in
terms of the wave function of each separate electron. In this case
for the “one-electron” wave functions we get equations similar to
those of the one-body problem (with certain additional terms). In
view of this, for an atom with one valence electron we can build
an equation for the electron wave function and speak of the electron
being in the field of the nucleus and the other (the inner)
electrons. As in the case of the hydrogen atom, this field will be
spherically symmetric, but it will not be a Coulomb field. Considering
what has been said, the case involving the non-Coulomb field
$U(r)$, which depends only on the distance from the nucleus, is of
great importance to physics.

Schrödinger’s theory gives a true picture in general outline of
atoms with one valence electron. Only a few details, notably the
fine structure of energy levels (the existence of doublets), cannot
be derived from the Schrödinger equation but can be interpreted
by Dirac’s theory, which takes the theory of relativity into account.
More than that, Dirac’s theory is needed to explain the behaviour
of an atom in a magnetic field (the Zeeman effect). True, the
Schrödinger equation can be generalized for the case of a magnetic
field, but since the corrections due to the magnetic field and
the theory of relativity are of the same order of magnitude, they
must be considered simultaneously. Dirac’s theory of the electron
is elaborated in Part V.

2. Conservation of angular momentum⁷

The Schrödinger wave equation for an electron in a field with
potential energy $U(x, y, z)$ is

$$ H\psi - i\hbar\,\frac{\partial\psi}{\partial t} = 0 \tag{2.1} $$

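where the Hamiltonian is

$$ H = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) + U(x, y, z) \tag{2.2} $$

We assume that $U$ depends only on the distance from the
atom’s nucleus, the latter being fixed and at the origin of a
coordinate system:

$$ U(x, y, z) = U(r), \qquad r = \left(x^2 + y^2 + z^2\right)^{1/2} \tag{2.3} $$

The wave equation will then be

$$ \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right)\psi + U(r)\psi - i\hbar\,\frac{\partial\psi}{\partial t} = 0 \tag{2.4} $$

or, if we express $p_x$, $p_y$, $p_z$ in terms of position derivatives,

$$ -\frac{\hbar^2}{2m}\nabla^2\psi + U(r)\psi - i\hbar\,\frac{\partial\psi}{\partial t} = 0 \tag{2.5} $$

where $\nabla^2$ is the Laplacian operator.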


In the case of a central field in classical mechanics we have the 
following conservation law: 

The components of angular momentum about the origin of 
coordinates 

$$ \begin{aligned} m_x &= y p_z - z p_y \\ m_y &= z p_x - x p_z \\ m_z &= x p_y - y p_x \end{aligned} \tag{2.6} $$

are constants of the motion. 


⁷ The law of conservation of angular momentum for a particle moving in
a central field is sometimes called the area integral.

But will these quantities be constants of the motion in quantum
mechanics if we assume them to be the operators considered in 
Section 7, Chapter III, Part I? For this we must see whether they 
commute with the Hamiltonian. One way is to use the Poisson 
bracket of Section 7, Chapter III, Part I. However, it is simpler 
to find the commutation relation directly. 

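We start with

$$ m_z\psi = \frac{\hbar}{i}\left(x\frac{\partial\psi}{\partial y} - y\frac{\partial\psi}{\partial x}\right) $$

Hence

$$ (\nabla^2 m_z)\psi - m_z\nabla^2\psi = \frac{\hbar}{i}\left[\nabla^2\left(x\frac{\partial\psi}{\partial y} - y\frac{\partial\psi}{\partial x}\right) - x\frac{\partial\,\nabla^2\psi}{\partial y} + y\frac{\partial\,\nabla^2\psi}{\partial x}\right] = 0 $$

We can prove in a similar way that $m_z$ commutes with $U(r)$,
namely

$$ (U m_z)\psi - m_z U\psi = \frac{\hbar}{i}\left[U\left(x\frac{\partial\psi}{\partial y} - y\frac{\partial\psi}{\partial x}\right) - x\frac{\partial (U\psi)}{\partial y} + y\frac{\partial (U\psi)}{\partial x}\right] = \frac{\hbar}{i}\left(y\frac{\partial U}{\partial x} - x\frac{\partial U}{\partial y}\right)\psi = 0 $$

since if $U = U(r)$ then

$$ y\frac{\partial U}{\partial x} - x\frac{\partial U}{\partial y} = 0 $$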

Thus $m_z$ commutes both with the Laplacian operator and with
the potential energy, which implies that it commutes with the
Hamiltonian. In view of the symmetry between the coordinates
$x$, $y$, $z$ the same is true for $m_x$ and $m_y$. Hence

$$ \begin{aligned} H m_x - m_x H &= 0 \\ H m_y - m_y H &= 0 \\ H m_z - m_z H &= 0 \end{aligned} \tag{2.7} $$

which means that 

The components of angular momentum are constants of the 
motion in quantum mechanics. 

However, these operators do not commute with each other. Indeed,
we already know that

$$ \begin{aligned} m_y m_z - m_z m_y &= i\hbar\, m_x \\ m_z m_x - m_x m_z &= i\hbar\, m_y \\ m_x m_y - m_y m_x &= i\hbar\, m_z \end{aligned} \tag{2.8} $$

which implies that the physical quantities $m_x$, $m_y$, $m_z$ cannot have
definite values simultaneously (with the exception of zero values).
Let us show that

$$ m^2 = m_x^2 + m_y^2 + m_z^2 \tag{2.9} $$

is an operator that commutes with each of the operators $m_x$, $m_y$, $m_z$.
We will call this new operator the square of angular momentum.
We have

$$ \begin{aligned} m_x^2 m_z - m_z m_x^2 &= m_x(m_x m_z - m_z m_x) + (m_x m_z - m_z m_x)m_x = -i\hbar\,(m_x m_y + m_y m_x) \\ m_y^2 m_z - m_z m_y^2 &= i\hbar\,(m_x m_y + m_y m_x) \\ m_z^2 m_z - m_z m_z^2 &= 0 \end{aligned} $$

Adding up the left- and right-hand sides, we get

$$ m^2 m_z - m_z m^2 = 0 $$

which in view of the symmetry between $x$, $y$, $z$ is actually three
commutation relations:

$$ \begin{aligned} m^2 m_x - m_x m^2 &= 0 \\ m^2 m_y - m_y m^2 &= 0 \\ m^2 m_z - m_z m^2 &= 0 \end{aligned} \tag{2.10} $$

Equations (2.10) express the physical fact that the square of 
angular momentum and any of the components of angular 
momentum can simultaneously have definite values. 
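The commutation relations (2.8) and (2.10) can be illustrated on a concrete finite-dimensional example. The sketch below assumes the standard $3 \times 3$ matrices that represent $m_x$, $m_y$, $m_z$ for $l = 1$ in units with $\hbar = 1$; these matrices are not derived in the text and are taken here as a known representation, and the helper names are ours.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Assumed l = 1 representation, hbar = 1, basis ordered by m = 1, 0, -1.
s = 1 / math.sqrt(2)
mx = [[0, s, 0], [s, 0, s], [0, s, 0]]
my = [[0, -1j * s, 0], [1j * s, 0, -1j * s], [0, 1j * s, 0]]
mz = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

# Square of angular momentum, Eq. (2.9)
m2 = add(add(matmul(mx, mx), matmul(my, my)), matmul(mz, mz))
zero = [[0] * 3 for _ in range(3)]
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

print(close(commutator(mx, my), scale(1j, mz)))   # last line of (2.8)
print(close(commutator(m2, mz), zero))            # last line of (2.10)
print(close(m2, scale(2, identity)))              # m^2 = l(l+1) with l = 1
```

In this representation $m^2$ is a multiple of the unit matrix, which makes its commutation with every component evident.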

On the other hand, since each of the operators $m_x$, $m_y$, $m_z$
commutes with the Hamiltonian, the sum of their squares also
commutes with it. Hence $m^2$ will be a constant of the motion:

$$ H m^2 - m^2 H = 0 \tag{2.11} $$

We note that $H$ is also a constant of the motion. Thus we have
three operators, $m_z$ (say), $m^2$, and $H$, which commute with each
other and are constants of the motion. The general theory then
states that we can choose the function that satisfies Eq. (2.1) so
that it is simultaneously the eigenfunction of all three operators
and hence is a solution to the following equations:

$$ H\psi = E\psi \tag{2.12} $$

$$ m^2\psi = \lambda\psi \tag{2.12*} $$

$$ m_z\psi = m'\psi \tag{2.12**} $$

3. Operators in spherical coordinates. Separation of variables

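Since in the problem under consideration the field is spherically
symmetric, in studying the operators in (2.12)-(2.12**) it is
convenient to introduce spherical coordinates $r$, $\theta$, $\varphi$, which are
related to Cartesian coordinates as

$$ x = r\sin\theta\cos\varphi, \qquad y = r\sin\theta\sin\varphi, \qquad z = r\cos\theta \tag{3.1} $$

Let us express the operators $m_x$, $m_y$, $m_z$ in terms of $\theta$ and $\varphi$:

$$ \begin{aligned} m_x\psi &= \frac{\hbar}{i}\left(y\frac{\partial\psi}{\partial z} - z\frac{\partial\psi}{\partial y}\right) = i\hbar\left(\sin\varphi\,\frac{\partial\psi}{\partial\theta} + \cot\theta\cos\varphi\,\frac{\partial\psi}{\partial\varphi}\right) \\ m_y\psi &= \frac{\hbar}{i}\left(z\frac{\partial\psi}{\partial x} - x\frac{\partial\psi}{\partial z}\right) = i\hbar\left(-\cos\varphi\,\frac{\partial\psi}{\partial\theta} + \cot\theta\sin\varphi\,\frac{\partial\psi}{\partial\varphi}\right) \\ m_z\psi &= \frac{\hbar}{i}\left(x\frac{\partial\psi}{\partial y} - y\frac{\partial\psi}{\partial x}\right) = -i\hbar\,\frac{\partial\psi}{\partial\varphi} \end{aligned} \tag{3.2} $$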


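Then $m^2 = m_x^2 + m_y^2 + m_z^2$ will operate on $\psi$ in the following way:

$$ \begin{aligned} m^2\psi = {} & -\hbar^2\left(\sin\varphi\,\frac{\partial}{\partial\theta} + \frac{\cos\varphi}{\tan\theta}\,\frac{\partial}{\partial\varphi}\right)\left(\sin\varphi\,\frac{\partial\psi}{\partial\theta} + \frac{\cos\varphi}{\tan\theta}\,\frac{\partial\psi}{\partial\varphi}\right) \\ & -\hbar^2\left(\cos\varphi\,\frac{\partial}{\partial\theta} - \frac{\sin\varphi}{\tan\theta}\,\frac{\partial}{\partial\varphi}\right)\left(\cos\varphi\,\frac{\partial\psi}{\partial\theta} - \frac{\sin\varphi}{\tan\theta}\,\frac{\partial\psi}{\partial\varphi}\right) - \hbar^2\,\frac{\partial^2\psi}{\partial\varphi^2} \end{aligned} $$

or simply

$$ m^2\psi = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2}\right] \tag{3.3} $$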


What we have obtained is the differential operator that appears
in potential theory in the study of the equation for spherical
harmonics (or functions) $Y_l(\theta, \varphi)$:

$$ \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial Y_l}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2 Y_l}{\partial\varphi^2} + l(l+1)\,Y_l = 0 \tag{3.4} $$

where the integer $l$ ($l = 0, 1, 2, \ldots$) is the degree of the spherical
harmonic. If we compare (3.4) with the eigenvalue equation for $m^2$,
(2.12*), we conclude that the eigenvalues of $m^2$ are

$$ \lambda = \hbar^2\, l(l+1), \qquad l = 0, 1, 2, \ldots \tag{3.5} $$
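As a quick numerical illustration of (3.3)-(3.5), the sketch below applies the angular operator in the bracket of (3.3) to a sample harmonic of degree $l = 2$ by central differences. The test function $\sin\theta\cos\theta\,e^{i\varphi}$ (an unnormalized harmonic with $l = 2$, $m = 1$), the step size, and the function names are our assumptions for the example.

```python
import cmath
import math

def sample_harmonic(theta, phi):
    """Unnormalized harmonic of degree l = 2: sin(theta) cos(theta) exp(i phi)."""
    return math.sin(theta) * math.cos(theta) * cmath.exp(1j * phi)

def angular_operator(f, theta, phi, h=1e-3):
    """Central-difference estimate of the bracket in (3.3):
    (1/sin) d/dtheta (sin df/dtheta) + (1/sin^2) d^2 f / dphi^2."""
    def flux(th):
        # sin(theta) * df/dtheta evaluated at th
        return math.sin(th) * (f(th + h / 2, phi) - f(th - h / 2, phi)) / h
    theta_part = (flux(theta + h / 2) - flux(theta - h / 2)) / (h * math.sin(theta))
    phi_part = (f(theta, phi + h) - 2 * f(theta, phi) + f(theta, phi - h)) \
        / (h * h * math.sin(theta) ** 2)
    return theta_part + phi_part

theta, phi = 0.7, 0.3
# m^2 Y = -hbar^2 [bracket] Y, so this ratio estimates lambda / hbar^2
ratio = -angular_operator(sample_harmonic, theta, phi) / sample_harmonic(theta, phi)
print(ratio)
```

The printed ratio should be close to $l(l+1) = 6$, with a negligible imaginary part.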


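To find the new form of $H$ we use the well-known expression for
the Laplacian operator in spherical coordinates. This yields

$$ H\psi = -\frac{\hbar^2}{2m}\left\{\frac{\partial^2\psi}{\partial r^2} + \frac{2}{r}\frac{\partial\psi}{\partial r} + \frac{1}{r^2}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2}\right]\right\} + U(r)\psi \tag{3.6} $$

We see that the right-hand side has the same derivatives with
respect to $\theta$ and $\varphi$ as $m^2$. As to the derivatives with respect to $r$,
we can write them as

$$ -\hbar^2\left(\frac{\partial^2\psi}{\partial r^2} + \frac{2}{r}\frac{\partial\psi}{\partial r}\right) = p_r^2\psi \tag{3.7} $$

where $p_r$ is an operator:

$$ p_r\psi = \frac{\hbar}{i}\left(\frac{\partial\psi}{\partial r} + \frac{\psi}{r}\right) \tag{3.8} $$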

which we can interpret, by analogy with $p_x = -i\hbar\,(\partial/\partial x)$, as the
operator of the radial component of momentum. If we
use $m^2$ and $p_r^2$ in $H$, we get

$$ H = \frac{1}{2m}\left(p_r^2 + \frac{1}{r^2}\,m^2\right) + U(r) \tag{3.9} $$

This energy operator corresponds in form to the classical
Hamiltonian function in spherical coordinates.

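The eigenvalue equations for the simultaneous eigenfunctions of
$H$ and $m^2$ can then be written in the following form:

$$ H\psi = \frac{1}{2m}\left(p_r^2 + \frac{1}{r^2}\,m^2\right)\psi + U(r)\psi = E\psi \tag{3.10} $$

$$ m^2\psi = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2}\right] = \hbar^2\, l(l+1)\,\psi \tag{3.11} $$

By using (3.8) and (3.11) we can express (3.10) as

$$ -\frac{\hbar^2}{2m}\left(\frac{\partial^2\psi}{\partial r^2} + \frac{2}{r}\frac{\partial\psi}{\partial r}\right) + \frac{\hbar^2\, l(l+1)}{2m r^2}\,\psi + U(r)\psi = E\psi \tag{3.12} $$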

We note that Eq. (3.11) contains explicitly only the variables $\theta$
and $\varphi$, and Eq. (3.12) the variable $r$. For this reason we can look
for the solution of these equations in the form of the product of
a function depending only on $r$ and a function depending only
on $\theta$ and $\varphi$. More than that, since $\psi$ must satisfy the wave equation

$$ H\psi = E\psi = i\hbar\,\frac{\partial\psi}{\partial t} \tag{3.13} $$

we can introduce the exponential factor $e^{-iEt/\hbar}$ and thus put

$$ \psi = e^{-iEt/\hbar}\,\psi^0(r, \theta, \varphi) \tag{3.14} $$

where, as we have just said,

$$ \psi^0(r, \theta, \varphi) = R(r)\,Y_l(\theta, \varphi) \tag{3.15} $$

The function that depends on angles $\theta$ and $\varphi$ we have interpreted
as a spherical harmonic of degree $l$ since it satisfies Eq. (3.4),
which coincides with (3.11). The radial function, $R(r)$, must satisfy
Eq. (3.12), which we write in the form

$$ \frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} - \frac{l(l+1)}{r^2}\,R + \frac{2m}{\hbar^2}\,[E - U(r)]\,R = 0 \tag{3.16} $$

Thus knowing the laws of conservation of energy and of angular
momentum enables us to separate the variables, that is, to reduce
the solution of the wave equation in four variables ($t$, $r$, $\theta$, $\varphi$) to
the solution of simpler equations in a smaller number of variables.

4. Solution of the differential equation 
for spherical harmonics 

We have seen that the eigenvalue equation (3.11) for the
eigenfunctions of the square of angular momentum coincides with
Eq. (3.4) for spherical harmonics. Hence studying angular
momentum reduces to studying spherical harmonics.

Let us find the simultaneous eigenfunctions of operators $m_z$
and $m^2$. The spherical harmonic $Y_l(\theta, \varphi)$ will be an eigenfunction
of $m_z$ if it satisfies the equation

$$ -i\hbar\,\frac{\partial Y_l}{\partial\varphi} = m'\,Y_l \tag{4.1} $$

whose solution is

$$ Y_l(\theta, \varphi) = \Theta(\theta)\,e^{im'\varphi/\hbar} $$

To ensure that $Y_l$ is a single-valued function of position in space
we must see that it is a periodic function in $\varphi$ with period $2\pi$.
This in turn means that the eigenvalues $m'$ are equal to

$$ m' = m\hbar, \qquad m = 0, \pm 1, \pm 2, \ldots \tag{4.2} $$

(Incidentally, we came to the same result in Section 7, Chapter III,
Part I.) Hence

$$ Y_l(\theta, \varphi) = \Theta(\theta)\,e^{im\varphi} \tag{4.3} $$

Substituting into Eq. (3.4), we get

$$ \frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) - \frac{m^2}{\sin^2\theta}\,\Theta + l(l+1)\,\Theta = 0 \tag{4.4} $$

If we introduce a new variable

$$ x = \cos\theta \tag{4.5} $$

then Eq. (4.4) reads

$$ \frac{d}{dx}\left[(1 - x^2)\frac{d\Theta}{dx}\right] - \frac{m^2}{1 - x^2}\,\Theta + l(l+1)\,\Theta = 0 \tag{4.6} $$

The singular points of this equation are $x = \pm 1$. Indeed, if we
solve it for the second derivative, the coefficients will become
infinitely great at $x = \pm 1$. Next, if we consider $l$ to be an
undefined parameter, we can show that Eq. (4.6) has a solution
that at $x = \pm 1$ remains finite only if $l$ is an integer. This implies
that spherical harmonics are the only solutions of (4.6) that
satisfy the stated conditions, that is, the only eigenfunctions of $m^2$.
Let us find the solution of (4.6) for integral $l$’s.

First we consider the case $m = 0$. Put

$$ y = (x^2 - 1)^l $$

and take the logarithmic derivative of $y$ with respect to $x$:

$$ \frac{y'}{y} = \frac{2lx}{x^2 - 1} $$

or

$$ (1 - x^2)\frac{dy}{dx} + 2lxy = 0 $$

We differentiate the last equation $k + 1$ times with respect to $x$
and put

$$ z = \frac{d^k y}{dx^k} = \frac{d^k}{dx^k}(x^2 - 1)^l \tag{4.7} $$

This yields

$$ (1 - x^2)\frac{d^2z}{dx^2} - (2k - 2l + 2)\,x\,\frac{dz}{dx} + (2l - k)(k + 1)\,z = 0 \tag{4.8} $$


If we put $k = l$, we get an equation that at $m = 0$ coincides with
(4.6). The solution of this equation that at $x = 1$ is unity is
denoted $P_l(x)$ and is called the Legendre polynomial of degree $l$.
It differs from (4.7) at $k = l$ only by a constant factor. If we
determine this factor from $P_l(1) = 1$, we get the Rodrigues
formula

$$ P_l(x) = \frac{1}{2^l\, l!}\frac{d^l}{dx^l}(x^2 - 1)^l \tag{4.9} $$

Hence the polynomial satisfies the equation

$$ \frac{d}{dx}\left[(1 - x^2)\frac{dP_l}{dx}\right] + l(l+1)\,P_l = 0 \tag{4.10} $$

which is a particular case of (4.6).
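The Rodrigues formula (4.9) lends itself to a direct check with exact rational arithmetic. The following sketch (the function names are ours, and polynomials are stored as coefficient lists with the lowest power first) builds $P_l$ from (4.9), and evaluates the left-hand side of (4.10) at a sample point.

```python
from fractions import Fraction
from math import comb, factorial

def derivative(p):
    """Differentiate a polynomial given as coefficients, lowest power first."""
    return [k * c for k, c in enumerate(p)][1:]

def evaluate(p, x):
    """Horner evaluation; an empty coefficient list is the zero polynomial."""
    result = Fraction(0)
    for c in reversed(p):
        result = result * x + c
    return result

def legendre(l):
    """Exact coefficients of P_l(x) from the Rodrigues formula (4.9)."""
    # Binomial expansion of (x^2 - 1)^l
    p = [Fraction(0)] * (2 * l + 1)
    for k in range(l + 1):
        p[2 * k] = Fraction((-1) ** (l - k) * comb(l, k))
    for _ in range(l):
        p = derivative(p)
    return [c / (2 ** l * factorial(l)) for c in p]

def legendre_equation_residual(l, x):
    """Left-hand side of (4.10) at the point x; it should vanish."""
    p = legendre(l)
    d1 = derivative(p)
    d2 = derivative(d1)
    return ((1 - x * x) * evaluate(d2, x) - 2 * x * evaluate(d1, x)
            + l * (l + 1) * evaluate(p, x))

for l in range(5):
    print(l, [str(c) for c in legendre(l)])
```

For $l = 2$ this prints the coefficients $-1/2,\ 0,\ 3/2$, that is, $P_2(x) = \tfrac{1}{2}(3x^2 - 1)$.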

Let us now consider the general case $m \neq 0$. We introduce the
substitution

$$ \Theta = (1 - x^2)^{m/2}\,v \tag{4.11} $$

which gives the following equation for $v$:

$$ (1 - x^2)\frac{d^2v}{dx^2} - (2m + 2)\,x\,\frac{dv}{dx} + (l - m)(l + m + 1)\,v = 0 \tag{4.12} $$

If we had put

$$ \Theta = (1 - x^2)^{-m/2}\,w \tag{4.13} $$

instead of (4.11), we would have had an equation for $w$ that
differs from (4.12) only in the sign of $m$, namely

$$ (1 - x^2)\frac{d^2w}{dx^2} + (2m - 2)\,x\,\frac{dw}{dx} + (l + m)(l - m + 1)\,w = 0 \tag{4.14} $$

Both equations, (4.12) and (4.14), are of the type of Eq. (4.8), and
for (4.12) integer $k$ is $l + m$, whereas for (4.14) it is $l - m$. For
this reason we must put

$$ v = c_1\,\frac{d^{l+m}}{dx^{l+m}}(x^2 - 1)^l \tag{4.15} $$

$$ w = c_2\,\frac{d^{l-m}}{dx^{l-m}}(x^2 - 1)^l \tag{4.16} $$

By equating the two substitutions for $\Theta$, (4.11) and (4.13), we
get

$$ \Theta = c_1\,(1 - x^2)^{m/2}\,\frac{d^{l+m}}{dx^{l+m}}(x^2 - 1)^l = c_2\,(1 - x^2)^{-m/2}\,\frac{d^{l-m}}{dx^{l-m}}(x^2 - 1)^l \tag{4.17} $$

To find the ratio of the two constants, $c_1/c_2$, it is sufficient to
equate (4.11) with (4.13) for any particular value of $x$.
Computations yield

$$ c_1\,(l + m)! = c_2\,(-1)^m\,(l - m)! \tag{4.18} $$


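Usually $c_1$ is chosen such that

$$ c_1 = \frac{1}{2^l\, l!} \tag{4.19} $$

which gives

$$ c_2 = (-1)^m\,\frac{(l + m)!}{(l - m)!}\,\frac{1}{2^l\, l!} \tag{4.20} $$

Also, the corresponding solution to Eq. (4.6) is denoted by $P_l^m(x)$.
Thus

$$ P_l^m(x) = (1 - x^2)^{m/2}\,\frac{d^{l+m}}{dx^{l+m}}\,\frac{(x^2 - 1)^l}{2^l\, l!} \tag{4.21} $$

which is also

$$ P_l^m(x) = (-1)^m\,\frac{(l + m)!}{(l - m)!}\,(1 - x^2)^{-m/2}\,\frac{d^{l-m}}{dx^{l-m}}\,\frac{(x^2 - 1)^l}{2^l\, l!} \tag{4.22} $$

These functions satisfy the equation

$$ \frac{d}{dx}\left[(1 - x^2)\frac{dP_l^m}{dx}\right] - \frac{m^2}{1 - x^2}\,P_l^m + l(l+1)\,P_l^m = 0 \tag{4.23} $$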

and represent the solutions that at $x = \pm 1$ are finite. (They are
sometimes called the associated Legendre polynomials of the
first kind.)

Formulas (4.21) and (4.22) give $P_l^m(x)$ for both positive and
negative values of the integer $m$, and from comparing (4.21) with
(4.22) it follows that

$$ P_l^{-m}(x) = (-1)^m\,\frac{(l - m)!}{(l + m)!}\,P_l^m(x) \tag{4.24} $$

Also, (4.21) and (4.22) at $|m| > l$ vanish, which implies that for
this case there are no solutions of (4.6) that at $x = \pm 1$ remain
finite. Hence $m$ (for a fixed $l$) can take on only the values

$$ m = -l,\ -l + 1,\ \ldots,\ l - 1,\ l \tag{4.25} $$

altogether $2l + 1$ values. The inequality

$$ |m| \le l \tag{4.26} $$

stems from the physical meaning of these quantities. Indeed, to
within a factor $\hbar^2$, the quantity $m^2$ is an eigenvalue of operator $m_z^2$,
and $l(l+1)$ is an eigenvalue of $m^2 = m_x^2 + m_y^2 + m_z^2$. This
results in that⁸

$$ m^2 \le l(l+1) < \left(l + \tfrac{1}{2}\right)^2 $$

which yields

$$ |m| < l + \tfrac{1}{2} $$

Since $|m|$ and $l$ are integers, the above inequality is equivalent
to (4.26).

Recalling the Rodrigues formula, (4.9), we can represent $P_l^m$
with positive $m$ in the form

$$ P_l^m(x) = (1 - x^2)^{m/2}\,\frac{d^m}{dx^m}P_l(x), \qquad m \ge 0 \tag{4.27} $$

In potential theory $m$ is usually taken to be positive, and (4.27)
is considered the definition of $P_l^m$.
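The definition (4.21) and the relation between the functions with upper indices $m$ and $-m$ can be tried out numerically. The sketch below is an illustration only: it evaluates (4.21) in floating point for $-l \le m \le l$ and $|x| < 1$, the function names are ours, and the factor $(l-m)!/(l+m)!$ in the comparison is the one that follows from setting (4.21) with $-m$ against (4.22).

```python
from math import comb, factorial

def derivative(p):
    """Differentiate a polynomial given as coefficients, lowest power first."""
    return [k * c for k, c in enumerate(p)][1:]

def assoc_legendre(l, m, x):
    """P_l^m(x) from (4.21), for -l <= m <= l and |x| < 1."""
    # Coefficients of (x^2 - 1)^l / (2^l l!)
    p = [0.0] * (2 * l + 1)
    for k in range(l + 1):
        p[2 * k] = (-1) ** (l - k) * comb(l, k) / (2 ** l * factorial(l))
    for _ in range(l + m):
        p = derivative(p)
    value = 0.0
    for c in reversed(p):
        value = value * x + c
    return (1.0 - x * x) ** (m / 2) * value

def mirrored(l, m, x):
    """Right-hand side of the relation between P_l^{-m} and P_l^m."""
    return (-1) ** m * factorial(l - m) / factorial(l + m) * assoc_legendre(l, m, x)

x = 0.3
for m in range(4):
    print(m, assoc_legendre(3, -m, x), mirrored(3, m, x))
```

The two printed columns should coincide for each $m$; for example, $P_3^2(0.3) = 15x(1 - x^2) = 4.095$.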


⁸ If $\psi$ is simultaneously the eigenfunction of $m^2$ and $m_z$, then
$\hbar^2\, l(l+1) = \int \bar\psi\,(m_x^2 + m_y^2 + m_z^2)\,\psi\,d\tau = \hbar^2 m^2 + \int \bar\psi\,(m_x^2 + m_y^2)\,\psi\,d\tau \ge \hbar^2 m^2$.

When we used Eq. (4.6), we assumed that $m$ was an integer.
In some problems connected with Pauli’s and Dirac’s theories of
the electron, however, $m$ can be a half-integer. The function in
(4.1) then will not be a single-valued function of position, since
it will change sign when $\varphi$ increases by $2\pi$. But the expressions
(4.17) for $\Theta$ retain their meaning when both $l$ and $m$ are
half-integers, which means that even in this case we can use them.


5. Some properties of spherical harmonics 

Henceforth we will need to use various properties of spherical
harmonics. So let us consider these properties in detail.

Recall the Cauchy formula

$$ f^{(l)}(x) = \frac{l!}{2\pi i}\oint \frac{f(z)\,dz}{(z - x)^{l+1}} \tag{5.1} $$

for the $l$th derivative of an analytic function. If we put

$$ f(z) = (z^2 - 1)^l $$

we can represent the Legendre polynomial, defined by (4.9), in
the form of an integral:

$$ P_l(x) = \frac{1}{2^l}\,\frac{1}{2\pi i}\oint \frac{(z^2 - 1)^l}{(z - x)^{l+1}}\,dz \tag{5.2} $$

Now we introduce a new variable $\zeta$:

$$ \zeta = \frac{2(z - x)}{z^2 - 1} $$

Solving this for $z$ and taking the root for which $z = x$ at $\zeta = 0$,
we get

$$ z = \frac{1}{\zeta}\left[1 - (1 - 2x\zeta + \zeta^2)^{1/2}\right] $$

It follows that

$$ \frac{dz}{z - x} = \frac{d\zeta}{\zeta\,(1 - 2x\zeta + \zeta^2)^{1/2}} $$

which turns (5.2) into

$$ P_l(x) = \frac{1}{2\pi i}\oint \frac{1}{(1 - 2x\zeta + \zeta^2)^{1/2}}\,\frac{d\zeta}{\zeta^{l+1}} $$

If we then apply the Cauchy formula (5.1), we get

$$ P_l(x) = \frac{1}{l!}\left(\frac{d^l}{d\zeta^l}\,\frac{1}{(1 - 2x\zeta + \zeta^2)^{1/2}}\right)_{\zeta = 0} $$

Hence the $P_l(x)$ are the expansion coefficients in the Taylor series

$$ \frac{1}{(1 - 2xr + r^2)^{1/2}} = \sum_{l=0}^{\infty} r^l\,P_l(x) \tag{5.5} $$


which is convenient for deriving various properties of the
Legendre polynomials.

For instance, let us differentiate (5.5) with respect to $r$. We get

$$ \frac{x - r}{(1 - 2xr + r^2)^{3/2}} = \sum_{l=0}^{\infty} l\,r^{l-1}\,P_l(x) \tag{5.6} $$

If we then multiply (5.6) by $2r$ and add the product to (5.5), we
get

$$ \frac{1 - r^2}{(1 - 2xr + r^2)^{3/2}} = \sum_{l=0}^{\infty} (2l + 1)\,r^l\,P_l(x) \tag{5.7} $$

On the other hand, if we multiply (5.6) by $r^2$ and (5.5) by $r$ and
add the two products, we find that

$$ \frac{r - xr^2}{(1 - 2xr + r^2)^{3/2}} = \sum_{l=0}^{\infty} l\,r^l\,P_{l-1}(x) \tag{5.8} $$

But the sum of (5.6) and (5.8) is equal to (5.7) multiplied by $x$:

$$ \sum_{l=0}^{\infty} r^l\,(l + 1)\,P_{l+1}(x) + \sum_{l=0}^{\infty} r^l\,l\,P_{l-1}(x) = \sum_{l=0}^{\infty} r^l\,(2l + 1)\,x\,P_l(x) $$

Identifying powers of $r$ finally yields

$$ (2l + 1)\,x\,P_l(x) = (l + 1)\,P_{l+1}(x) + l\,P_{l-1}(x) \tag{5.9} $$

What we have arrived at is a recursion relation that enables us
to find $P_{l+1}(x)$ if $P_l(x)$ and $P_{l-1}(x)$ are known.
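The recursion (5.9) and the expansion (5.5) can be played against each other numerically. In the sketch below (function names and the sample values of $x$ and $r$ are ours) the polynomials are generated from $P_0 = 1$, $P_1 = x$ by (5.9), and a partial sum of (5.5) is compared with the closed form on the left-hand side.

```python
def legendre_values(x, lmax):
    """P_0(x), ..., P_lmax(x) generated by the recursion (5.9)."""
    values = [1.0, x]
    for l in range(1, lmax):
        # (l + 1) P_{l+1} = (2l + 1) x P_l - l P_{l-1}
        values.append(((2 * l + 1) * x * values[l] - l * values[l - 1]) / (l + 1))
    return values[:lmax + 1]

def generating_function(x, r):
    """Left-hand side of (5.5)."""
    return (1.0 - 2.0 * x * r + r * r) ** -0.5

x, r = 0.4, 0.3
partial_sum = sum(r ** l * p for l, p in enumerate(legendre_values(x, 60)))
print(partial_sum, generating_function(x, r))
```

Since $|P_l(x)| \le 1$ on $[-1, 1]$, the neglected tail of the series is smaller than $r^{61}/(1 - r)$, so the two printed numbers should agree to machine precision.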

Now we differentiate (5.5) with respect to $x$ and divide the
result by $r$. We get

$$ \frac{1}{(1 - 2xr + r^2)^{3/2}} = \sum_{l=0}^{\infty} r^l\,\frac{dP_{l+1}}{dx} $$

If we then multiply this expression by $1 - r^2$, we will obtain

$$ \frac{1 - r^2}{(1 - 2xr + r^2)^{3/2}} = \sum_{l=0}^{\infty} r^l\left(\frac{dP_{l+1}}{dx} - \frac{dP_{l-1}}{dx}\right) \tag{5.10} $$

Comparison of (5.10) and (5.7) yields another property of the
Legendre polynomials:

$$ (2l + 1)\,P_l(x) = \frac{dP_{l+1}}{dx} - \frac{dP_{l-1}}{dx} \tag{5.11} $$

We will generalize (5.9) and (5.11) to include spherical
harmonics (more correctly, the associated Legendre polynomials of
the first kind). If we differentiate (5.11) $m$ times with respect
to $x$ and then multiply the result by $(1 - x^2)^{(m+1)/2}$, with the help
of (4.27) we can write the recursion relation

$$ (2l + 1)\,(1 - x^2)^{1/2}\,P_l^m(x) = P_{l+1}^{m+1}(x) - P_{l-1}^{m+1}(x) \tag{5.12} $$

On the other hand, if we differentiate (5.9) $m$ times with respect
to $x$ and multiply the result by $(1 - x^2)^{m/2}$, we get

$$ (2l + 1)\,x\,P_l^m(x) + (2l + 1)\,m\,(1 - x^2)^{1/2}\,P_l^{m-1}(x) = (l + 1)\,P_{l+1}^m(x) + l\,P_{l-1}^m(x) $$

The last step is to substitute $P_{l+1}^m(x) - P_{l-1}^m(x)$ for
$(2l + 1)(1 - x^2)^{1/2}\,P_l^{m-1}(x)$ by using (5.12). This yields

$$ (2l + 1)\,x\,P_l^m(x) = (l - m + 1)\,P_{l+1}^m(x) + (l + m)\,P_{l-1}^m(x) \tag{5.13} $$

which interrelates three spherical harmonics in succession with
the same value of $m$.

We note that (5.12) and (5.13) hold for both positive and
negative values of $m$. If the absolute value of the upper index of
the spherical harmonic is greater than the lower index, the
spherical harmonic vanishes.

Finally, let us elaborate on the system of differential equations
for spherical harmonics since we will need to use it when we
come to Dirac’s theory of the electron.

To start with, we multiply (4.22) by $(1 - x^2)^{m/2}$ and
differentiate with respect to $x$. We write the result as follows:

$$ \frac{d}{dx}\left[(1 - x^2)^{m/2}\,P_l^m(x)\right] = -(l + m)(l - m + 1)\,(1 - x^2)^{(m-1)/2}\,P_l^{m-1}(x) $$

or, if we change $m$ to $m + 1$,

$$ \frac{d}{dx}\left[(1 - x^2)^{(m+1)/2}\,P_l^{m+1}(x)\right] = -(l + m + 1)(l - m)\,(1 - x^2)^{m/2}\,P_l^m(x) \tag{5.14} $$

By multiplying (4.21) by $(1 - x^2)^{-m/2}$ and differentiating with
respect to $x$, we get

$$ \frac{d}{dx}\left[(1 - x^2)^{-m/2}\,P_l^m(x)\right] = (1 - x^2)^{-(m+1)/2}\,P_l^{m+1}(x) \tag{5.15} $$

The two equations, (5.14) and (5.15), constitute a system of
equations, which, after excluding $P_l^{m+1}(x)$, yields an equation for
$P_l^m(x)$, (4.23). Let us replace $x$ in (5.14) and (5.15) by a new
variable: $\theta = \arccos x$. The equations then read

$$ \frac{d}{d\theta}\left[(\sin\theta)^{m+1}\,P_l^{m+1}(\cos\theta)\right] = (l + m + 1)(l - m)\,(\sin\theta)^{m+1}\,P_l^m(\cos\theta) \tag{5.16} $$

$$ \frac{d}{d\theta}\left[(\sin\theta)^{-m}\,P_l^m(\cos\theta)\right] = -(\sin\theta)^{-m}\,P_l^{m+1}(\cos\theta) \tag{5.17} $$

or, if we differentiate explicitly,

$$ \frac{d}{d\theta}P_l^m(\cos\theta) - m\cot\theta\,P_l^m(\cos\theta) = -P_l^{m+1}(\cos\theta) \tag{5.18} $$

$$ \frac{d}{d\theta}P_l^{m+1}(\cos\theta) + (m + 1)\cot\theta\,P_l^{m+1}(\cos\theta) = (l + m + 1)(l - m)\,P_l^m(\cos\theta) \tag{5.19} $$

We will return to these equations in Dirac’s theory.

6. Normalized spherical harmonics 

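The functions $P_l^m(x)$ of the previous sections constitute a
complete set of eigenfunctions, the eigenfunctions of the Hermitian
operator on the left-hand side of the equation

$$ -\frac{d}{dx}\left[(1 - x^2)\frac{d\Theta}{dx}\right] + \frac{m^2}{1 - x^2}\,\Theta = l(l+1)\,\Theta \tag{6.1} $$

They are orthogonal

$$ \int_{-1}^{+1} P_l^m(x)\,P_{l'}^m(x)\,dx = 0, \qquad l \neq l' $$

but not normalized. We introduce new functions

$$ P_l^{*m}(x) = c_{lm}\,P_l^m(x) \tag{6.2} $$

and normalize them so that

$$ \frac{1}{2}\int_{-1}^{+1} P_l^{*m}(x)\,P_{l'}^{*m}(x)\,dx = \delta_{ll'} \tag{6.3} $$

We have still to find the factors $c_{lm}$. But we see that

$$ \frac{2}{(c_{lm})^2} = \int_{-1}^{+1} \left[P_l^m(x)\right]^2 dx \tag{6.4} $$

To evaluate the integral we substitute the product of (4.21) and
(4.22) for $\left[P_l^m\right]^2$. We get

$$ \frac{2}{(c_{lm})^2} = (-1)^m\,\frac{(l + m)!}{(l - m)!}\,\frac{1}{(2^l\, l!)^2}\int_{-1}^{+1} \frac{d^{l-m}(x^2 - 1)^l}{dx^{l-m}}\,\frac{d^{l+m}(x^2 - 1)^l}{dx^{l+m}}\,dx $$

If we then integrate the last expression $l - m$ times by parts and
note that

$$ \frac{d^{2l}}{dx^{2l}}(x^2 - 1)^l = (2l)! $$

we arrive at

$$ \frac{2}{(c_{lm})^2} = \frac{(l + m)!}{(l - m)!}\,\frac{(2l)!}{(2^l\, l!)^2}\int_{-1}^{+1} (1 - x^2)^l\,dx $$

This integral is easily found; it is

$$ \int_{-1}^{+1} (1 - x^2)^l\,dx = \frac{2}{2l + 1}\,\frac{(2^l\, l!)^2}{(2l)!} $$

Hence

$$ \frac{1}{2}\int_{-1}^{+1} \left[P_l^m(x)\right]^2 dx = \frac{1}{2l + 1}\,\frac{(l + m)!}{(l - m)!} \tag{6.5} $$

which implies that

$$ c_{lm} = (2l + 1)^{1/2}\left[\frac{(l - m)!}{(l + m)!}\right]^{1/2} \tag{6.6} $$

The normalized spherical harmonics are

$$ P_l^{*m}(x) = (2l + 1)^{1/2}\left[\frac{(l - m)!}{(l + m)!}\right]^{1/2} P_l^m(x) \tag{6.7} $$

Let us express these functions through the derivatives explicitly.
Recalling (4.21) and (4.22), we get

$$ P_l^{*m}(x) = (2l + 1)^{1/2}\left[\frac{(l - m)!}{(l + m)!}\right]^{1/2}(1 - x^2)^{m/2}\,\frac{d^{l+m}}{dx^{l+m}}\,\frac{(x^2 - 1)^l}{2^l\, l!} \tag{6.8} $$

and

$$ P_l^{*m}(x) = (-1)^m\,(2l + 1)^{1/2}\left[\frac{(l + m)!}{(l - m)!}\right]^{1/2}(1 - x^2)^{-m/2}\,\frac{d^{l-m}}{dx^{l-m}}\,\frac{(x^2 - 1)^l}{2^l\, l!} \tag{6.9} $$

Hence we see that

$$ P_l^{*\,-m}(x) = (-1)^m\,P_l^{*m}(x) \tag{6.10} $$

For the normalized spherical harmonics the recursion
relations (5.12) and (5.13) read

$$ x\,P_l^{*m}(x) = \left[\frac{(l + 1)^2 - m^2}{4(l + 1)^2 - 1}\right]^{1/2} P_{l+1}^{*m}(x) + \left[\frac{l^2 - m^2}{4l^2 - 1}\right]^{1/2} P_{l-1}^{*m}(x) \tag{6.11} $$

$$ (1 - x^2)^{1/2}\,P_l^{*m}(x) = -\left[\frac{(l - m)(l - m - 1)}{(2l - 1)(2l + 1)}\right]^{1/2} P_{l-1}^{*\,m+1}(x) + \left[\frac{(l + m + 1)(l + m + 2)}{(2l + 1)(2l + 3)}\right]^{1/2} P_{l+1}^{*\,m+1}(x) \tag{6.12} $$

In conclusion let us list some of the Legendre polynomials and
the functions $P_l^{*m}(x)$:

$$ P_0(x) = 1, \qquad P_1(x) = x, \qquad P_2(x) = \tfrac{1}{2}(3x^2 - 1) $$

$$ P_3(x) = \tfrac{1}{2}(5x^3 - 3x), \qquad P_4(x) = \tfrac{1}{8}(35x^4 - 30x^2 + 3) $$

$$ P_1^{*1}(x) = \sqrt{\tfrac{3}{2}}\,(1 - x^2)^{1/2} $$

$$ P_2^{*2}(x) = \sqrt{\tfrac{15}{8}}\,(1 - x^2), \qquad P_2^{*1}(x) = \sqrt{\tfrac{15}{2}}\,(1 - x^2)^{1/2}\,x $$

$$ P_3^{*3}(x) = \sqrt{\tfrac{35}{16}}\,(1 - x^2)^{3/2}, \qquad P_3^{*2}(x) = \sqrt{\tfrac{105}{8}}\,(1 - x^2)\,x $$

$$ P_3^{*1}(x) = \sqrt{\tfrac{21}{16}}\,(1 - x^2)^{1/2}\,(5x^2 - 1) $$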
7. The radial functions. A general survey 

Let us consider differential equation (3.16) for the radial
functions, which for the sake of convenience we will restate

$$ \frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} - \frac{l(l+1)}{r^2}\,R + \frac{2m}{\hbar^2}\,[E - U(r)]\,R = 0 \tag{7.1} $$

To analyze this equation we must assume a certain behaviour
of the potential energy for small and great distances from the
nucleus. We start with the case of great distances. Assume that
when $r \to \infty$ the potential energy can be represented as

$$ U(r) = -\frac{A}{r} + \frac{B}{r^2} + \ldots \tag{7.2} $$

The first term on the right-hand side, $-A/r$, is the Coulomb field,
which acts over great distances. Coefficient $A$ for a valence
electron is

$$ A = Z^* e^2 $$

with $Z^* e$ the effective charge of the nucleus (the algebraic sum of
the charges of the nucleus and the inner-shell electrons), which
implies that $A$ is positive (attraction). The wave equation for
an $\alpha$-particle will have a negative $A$ (repulsion).

Let us try to find the behaviour of the solution at large values
of $r$. For this we assume that

$$ R = r^\beta e^{\alpha r}\left(1 + \frac{C}{r} + \ldots\right) \tag{7.3} $$

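where the omitted terms are of the order of $1/r^2$ and higher. The
terms in Eq. (7.1) are then

$$ \frac{d^2R}{dr^2} = r^\beta e^{\alpha r}\left(\alpha^2 + \frac{2\alpha\beta + C\alpha^2}{r} + \ldots\right) $$

$$ \frac{2}{r}\frac{dR}{dr} = r^\beta e^{\alpha r}\left(\frac{2\alpha}{r} + \ldots\right) $$

$$ \frac{2m}{\hbar^2}\left(E + \frac{A}{r} + \ldots\right)R = r^\beta e^{\alpha r}\,\frac{2m}{\hbar^2}\left(E + \frac{A + CE}{r} + \ldots\right) $$

If we then substitute these expressions into the equation and divide
$r^\beta e^{\alpha r}$ out, we get

$$ \alpha^2 + \frac{2m}{\hbar^2}E + \left[2(\beta + 1)\alpha + \frac{2m}{\hbar^2}A + C\left(\alpha^2 + \frac{2m}{\hbar^2}E\right)\right]\frac{1}{r} + \ldots = 0 $$

from which we derive two equations

$$ \alpha^2 + \frac{2m}{\hbar^2}E = 0, \qquad 2(\beta + 1)\alpha + \frac{2m}{\hbar^2}A = 0 \tag{7.4} $$

for the constants $\alpha$ and $\beta$. They have two solutions

$$ \alpha = (-2mE/\hbar^2)^{1/2} \tag{7.5} $$

$$ \beta = -1 + A\alpha/(2E) \tag{7.6} $$

corresponding to the two possible signs in (7.5). If we choose one
sign, the leading terms of the general solution to (7.1) are

$$ R = \frac{1}{r}\left\{C_1 \exp\left[\alpha\left(r + \frac{A}{2E}\log r\right)\right] + C_2 \exp\left[-\alpha\left(r + \frac{A}{2E}\log r\right)\right]\right\} \tag{7.7} $$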

We see that the behaviour of the solution is different depending
on whether $\alpha$ is real or imaginary.

If the total energy, $E$, is positive (which in classical mechanics
corresponds to orbits that extend to infinity), $\alpha$ is pure imaginary:

$$ E > 0, \qquad \alpha = i\,(2mE/\hbar^2)^{1/2} \tag{7.8} $$

Then the general solution (7.7) will, as $r \to \infty$, alternate signs
and tend to zero like $1/r$. However, it will fall off so slowly that
the integral

$$ \int_{r_0}^{\infty} r^2\,|R(r)|^2\,dr \tag{7.9} $$

with $r_0$ a certain finite constant, will diverge.

If the total energy is negative, $\alpha$ is real (we will assume it to
be positive):

$$ E < 0, \qquad \alpha = \left|(-2mE/\hbar^2)^{1/2}\right| \tag{7.10} $$

and the behaviour of the solution depends on whether $C_1$ vanishes.
If $C_1 \neq 0$, then (7.7) will, as $r \to \infty$, grow infinitely. But if $C_1$ is
zero, $R$ will fall off at infinity according to the exponential law
and (7.9) will converge.

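What remains is to study the case when $E = 0$. We must look
for the solution in a somewhat different form, namely

$$ R = r^{\beta_1} e^{\alpha_1 r^{1/2}}\left(1 + \frac{C}{r^{1/2}} + \ldots\right) \tag{7.11} $$

Proceeding as we did above, we get

$$ \frac{\alpha_1^2}{4} + \frac{2mA}{\hbar^2} + \left[\alpha_1\left(\beta_1 + \frac{3}{4}\right) + C\left(\frac{\alpha_1^2}{4} + \frac{2mA}{\hbar^2}\right)\right]\frac{1}{r^{1/2}} + \ldots = 0 $$

which implies that

$$ \alpha_1 = 2\,(-2mA/\hbar^2)^{1/2} \tag{7.12} $$

$$ \beta_1 = -\tfrac{3}{4} \tag{7.13} $$

Hence the general solution in this case is

$$ R = r^{-3/4}\left(C_1'\,e^{\alpha_1 r^{1/2}} + C_2'\,e^{-\alpha_1 r^{1/2}}\right) \tag{7.14} $$

If $A > 0$ (attraction at great distances), $\alpha_1$ is pure imaginary
and $R$ will be finite. If $A < 0$ (repulsion), $R$ will, generally
speaking, grow as $r \to \infty$.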

We now turn to the case of small $r$ [in Eq. (7.1)]. Let us assume
that at $r = 0$ the potential energy grows no faster than $1/r$:

$$ U(r) = -\frac{A_1}{r} + (\text{finite function for } r = 0) \tag{7.15} $$

This corresponds to a Coulomb field at small distances from the
nucleus. The coefficient $A_1$ can differ from $A$ in (7.2). For a valence
electron

$$ A_1 = Ze^2 $$

with $Ze$ the charge of the nucleus.

We seek the solution in the form

$$ R = r^\alpha + C\,r^{\alpha + 1} + \ldots \tag{7.16} $$

Substituting (7.16) into (7.1) and equating with zero the term
that has $r$ in the lowest power, we get

$$ \alpha(\alpha + 1) - l(l + 1) = 0 $$

which simply means that

$$ \alpha = l \quad \text{or} \quad \alpha = -l - 1 \tag{7.17} $$

The general solution of our equation will have the following form:

$$ R = C'\,r^l\,(1 + a'r + \ldots) + C''\,r^{-l-1}\,(1 + a''r + \ldots) \tag{7.18} $$

Thus at $r = 0$ the solution does not depend either on $E$ or on
the expansion coefficients of $U(r)$ in (7.15), and we must put
$C'' = 0$ if we want our solution to remain finite at $r = 0$.

Combining this result with the one obtained for large values
of $r$, we come to the following conclusion.

At $E > 0$ all solutions, including the one that remains finite at
$r = 0$, vanish as $r \to \infty$. Then to find a solution $R(r)$ that is
finite everywhere it is sufficient to take the solution that at $r = 0$
is finite, which means that at $E > 0$ the Hamiltonian has a
continuous spectrum in the interval from 0 to $\infty$ (the eigenvalue
$E = 0$ lies in the continuous spectrum only for attractive forces).
At the same time for $E \ge 0$ the integral (7.9) is not finite. This
implies that at $E \ge 0$ there cannot be a discrete spectrum, since
eigenfunctions belonging to a discrete spectrum are square
integrable.

Now what happens when $E < 0$? The solution that at $r = 0$
remains finite passes, as $r \to \infty$, into an expression of type (7.7)
with a real exponent $\alpha$. The ratio $C_1/C_2$ is determined by the value
of $E$. Two cases are possible here. In one this ratio is finite and
$R(r)$ tends to infinity as $r \to \infty$, which means that the
corresponding value of $E$ is not an eigenvalue of the Hamiltonian. In the
other this ratio is zero and $R(r)$ falls off at infinity so rapidly
that integral (7.9) is finite. The corresponding $E = E_n$ is an
eigenvalue belonging to the discrete spectrum, and we conclude that
at $E < 0$ there is no continuous spectrum, and either there is a
discrete spectrum or there is no spectrum at all. The first case
holds for attraction, and the second for repulsion.
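The argument for $E < 0$ can be turned into a simple numerical experiment, sometimes called the shooting method. The sketch below rests on several assumptions that are ours and not part of the text: a pure Coulomb field $U = -1/r$, units with $\hbar = m = 1$, the substitution $u = rR$, which turns (7.1) into $u'' = [\,l(l+1)/r^2 - 2/r - 2E\,]\,u$, a fourth-order Runge-Kutta integrator, and starting values taken from the two leading terms of the regular series in (7.18). For a trial $E$ the regular solution is integrated outward; its sign at large $r$ shows on which side of an eigenvalue $E$ lies, and bisection then narrows the interval.

```python
def outer_value(E, l=0, r_start=1e-3, r_max=15.0, h=0.002):
    """Integrate u'' = [l(l+1)/r^2 - 2/r - 2E] u outward and return u(r_max).

    Assumptions: u = r R, U(r) = -1/r, hbar = m = 1.
    """
    def accel(r, u):
        return (l * (l + 1) / (r * r) - 2.0 / r - 2.0 * E) * u

    r = r_start
    # Regular solution near r = 0: u ~ r^(l+1) * (1 - r/(l+1))
    u = r ** (l + 1) * (1.0 - r / (l + 1))
    v = (l + 1) * r ** l - (l + 2) * r ** (l + 1) / (l + 1)
    while r < r_max:
        # Classical fourth-order Runge-Kutta step for the pair (u, v = u')
        k1u, k1v = v, accel(r, u)
        k2u, k2v = v + 0.5 * h * k1v, accel(r + 0.5 * h, u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, accel(r + 0.5 * h, u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, accel(r + h, u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        r += h
    return u

def find_level(E_low, E_high, l=0, iterations=30):
    """Bisect on the sign of u(r_max); the bracket must contain one level."""
    f_low = outer_value(E_low, l)
    for _ in range(iterations):
        E_mid = 0.5 * (E_low + E_high)
        f_mid = outer_value(E_mid, l)
        if f_low * f_mid > 0:
            E_low, f_low = E_mid, f_mid
        else:
            E_high = E_mid
    return 0.5 * (E_low + E_high)

ground_level = find_level(-0.6, -0.4)
print(ground_level)
```

In these units the lowest level of the Coulomb field is $E = -1/2$, and the printed value should be close to it. For any other trial energy in the bracket the computed $u$ grows exponentially at large $r$, which is the case $C_1 \neq 0$ of (7.7).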

Thus for attractive forces the eigenvalue spectrum of the
Hamiltonian consists of a collection of negative numbers

$$ E_1,\ E_2,\ \ldots,\ E_n,\ \ldots \quad \text{(the discrete spectrum)} \tag{7.19} $$

and a continuous set of values

$$ 0 \le E < \infty \quad \text{(the continuous spectrum)} \tag{7.20} $$

Since the radial equation contains $l$ as a parameter, the eigenvalues
belonging to the discrete spectrum also depend on $l$, and we will
denote them by $E_{nl}$. For the corresponding radial functions we
will use $R_{nl}$ for the discrete spectrum and $R_{El}$ for the continuous
spectrum.

Let us consider an electron that is in a state with definite energy
and angular momentum. According to the probabilistic
interpretation of the wave function, the relative probability of the electron’s
being (after the proper measuring process has been completed)
at a distance between $r$ and $r + dr$ is $|R_{nl}(r)|^2 r^2\,dr$ for the discrete
spectrum ($E_{nl} < 0$) and $|R_{El}(r)|^2 r^2\,dr$ for the continuous spectrum
($E > 0$). In classical mechanics, on the other hand, negative
energies correspond to closed orbits and positive energies to orbits
that extend to infinity. Using the classical-analogy concept, we
can expect that for the discrete spectrum the probability of
finding the electron at great distances from the atom is considerably
smaller than for the continuous spectrum. Indeed, if we recall the
properties of the radial functions, we see that $|R_{nl}|^2 r^2$ falls off at
infinity according to the exponential law, whereas $|R_{El}|^2 r^2$ usually
remains finite.


8. Description of the states of a valence electron. 
Quantum numbers 


Let us summarize what we have learned about an electron moving
in a field with central symmetry (a valence electron in an
atom). Such an electron can be described by a wave function

    $\psi_{nlm} = e^{-iE_{nl}t/\hbar} R_{nl}(r)\, Y_{lm}(\theta, \varphi)$    (8.1)

for the discrete spectrum, and by a wave function

    $\psi_{Elm} = e^{-iEt/\hbar} R_{El}(r)\, Y_{lm}(\theta, \varphi)$    (8.2)

for the continuous spectrum. We assume the radial functions to
be normalized. Namely, for the discrete spectrum

    $\int_0^\infty |R_{nl}(r)|^2 r^2\,dr = 1$    (8.3)

and for the continuous spectrum

    $\lim_{\Delta E \to 0} \frac{1}{\Delta E} \int_0^\infty \left| \int_E^{E+\Delta E} R_{El}(r)\,dE \right|^2 r^2\,dr = 1$    (8.4)

The spherical harmonic $Y_{lm}(\theta, \varphi)$ is expressed, according to the
results of Sections 4 and 6, in the following way:

    $Y_{lm}(\theta, \varphi) = \frac{1}{\sqrt{4\pi}}\, e^{im\varphi} P_l^{*m}(\cos\theta)$    (8.5)

and the normalization condition is

    $\int_{\theta=0}^{\pi} \int_{\varphi=0}^{2\pi} |Y_{lm}(\theta, \varphi)|^2 \sin\theta\,d\theta\,d\varphi = 1$    (8.6)

Functions (8.1) and (8.2) are the simultaneous eigenfunctions
of the Hamiltonian $H$, the square of angular momentum $\mathbf{m}^2$, and
the component of angular momentum $m_z$ along the z axis. This is
why in a state described by either (8.1) or (8.2) these three
quantities have definite values simultaneously:

    quantity          eigenvalue

    $H$               $E_{nl}$ or $E$

    $\mathbf{m}^2$    $l(l+1)\hbar^2$

    $m_z$             $m\hbar$                    (8.7)

Thus a state is characterized by three quantum numbers, $n$, $l$,
and $m$, or by one continuous parameter, $E$, and two quantum
numbers, $l$ and $m$. The label $n$ is called the principal quantum
number; it is usually defined as the sum

    $n = n_r + l + 1$    (8.8)

where $n_r$ is the number of zeros (or nodes) of $R_{nl}(r)$.
The label $n_r$ is called the radial quantum number, and $l$ the azimuthal
quantum number. Such a definition for $n$ is possible because
when an eigenfunction of a differential operator of the type in (7.1)
belongs to a discrete spectrum, it is characterized by the number
of its nodes. Since $n_r$ is nonnegative, the principal quantum number
is greater than the azimuthal quantum number by at least one.

If we study Eq. (7.1), we see that $m$ does not appear in it. This means
that the energy levels $E_{nl}$ do not depend on $m$ (in other words, the
energy state term does not determine the value of $m$). This was to
be expected since $m\hbar$ is the value of the angular momentum component
along the z axis, and in the case of a central field the
direction of this (and any other) axis is undefined from the physical
point of view. But if a magnetic field is applied to the system [9]
(we call the direction of this field the z axis), the energy levels
will depend on $m$ as well. For this reason $m$ is called the magnetic
quantum number.

[9] As we noted at the beginning of this chapter, we must use the Dirac
equation in the case of a magnetic field.

The Coulomb field has a specific property, which we will study 
in the next chapter — the energy of an electron in such a field 
depends only on n. 

For the general case of a central field each energy level $E_{nl}$
has corresponding to it $2l + 1$ eigenfunctions, which can be found
if in (8.1) we put

    $m = -l,\ -l + 1,\ \ldots,\ l - 1,\ l$    (8.9)

Because of this the energy level is $(2l + 1)$-fold degenerate. In
the case of a Coulomb field the degeneracy is greater, since for
a given $n$ the azimuthal quantum number, $l$, can take the values

    $l = 0, 1, \ldots, n - 1$    (8.10)

and to each value of $l$ there correspond $2l + 1$ different values
of $m$. The total degree of degeneracy is the sum

    $1 + 3 + 5 + \ldots + (2n - 1) = n^2$    (8.11)
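
The counting in (8.9)-(8.11) can be checked by direct enumeration. The following is a minimal sketch; the function names are illustrative and do not come from the text.

```python
def magnetic_numbers(l):
    """Values m = -l, ..., l allowed by (8.9) for a given azimuthal number l."""
    return list(range(-l, l + 1))


def coulomb_degeneracy(n):
    """Number of (l, m) pairs with l = 0, ..., n - 1, as in (8.10) and (8.9)."""
    return sum(len(magnetic_numbers(l)) for l in range(n))


for n in range(1, 6):
    # The second and third columns should agree, as stated by (8.11).
    print(n, coulomb_degeneracy(n), n ** 2)
```

Spin is not counted here, in line with the Schrödinger theory used in this chapter.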

There is an accepted spectroscopic notation for energy state
terms. Terms with the same value of $n$ but different values of $l$
are denoted by Latin letters s, p, d, f, g, ... . For instance

    term              spectroscopic notation

    n = 1, l = 0      (1s)
    n = 2, l = 0      (2s)
    n = 2, l = 1      (2p)
    n = 3, l = 0      (3s)
    n = 3, l = 1      (3p)
    n = 3, l = 2      (3d)

We note that according to Schrödinger's theory every spectral
term is simple (a singlet), whereas in reality all terms except the s terms
(corresponding to $l = 0$) are doublets, that is, consist of two
adjacent terms (the fine structure of spectral lines). As we will
see in Part V, Dirac's theory of the electron provides an explanation
of this phenomenon.


9. The selection rule 

Not knowing the exact form of the radial functions, we cannot
calculate the elements of Heisenberg's matrices, which according
to the results obtained in Section 3, Chapter III, characterize the
intensity of the spectral lines corresponding to various transitions.
However, we can state which of these matrix elements vanish,
because we know the angular dependence of the eigenfunctions.
In other words, we can derive the selection rule.

First we must generalize the formulas for the intensities derived
in Sections 3 and 4, Chapter III, to the case of several
quantum numbers and degenerate energy levels. We recall that

    $I_{nn'} = \frac{4e^2\omega_{nn'}^4}{3c^3} \left( |x_{nn'}|^2 + |y_{nn'}|^2 + |z_{nn'}|^2 \right)$    (9.1)

for the discrete spectrum, and

    $I_n(E)\,\Delta E = \frac{4e^2\omega_n^4(E)}{3c^3} \left[ |(E_n|x|E)|^2 + |(E_n|y|E)|^2 + |(E_n|z|E)|^2 \right] \Delta E$    (9.2)
for the continuous spectrum. In our case the state of an electron
is given by specifying three quantum numbers. Hence the elements
of Heisenberg's matrix for $x$ will be

    $x_{nn'} = (nlm|x|n'l'm')$    (9.3)

for the discrete spectrum, and

    $(E_n|x|E) = (nlm|x|El'm')$    (9.4)

for the continuous spectrum. The frequencies $\omega_{nn'}$ and $\omega_n(E)$ are
understood to be respectively

    $\omega_{nn'} = (E_{nl} - E_{n'l'})/\hbar$    (9.5)

    $\omega_n(E) = (E_{nl} - E)/\hbar$    (9.6)

or, more correctly, the absolute values of these quantities.

Transitions that differ in the values of the quantum numbers $m$
and $m'$ may have the same frequencies. In normal conditions
(without a magnetic field) such transitions cannot be distinguished.
What is observed is the sum of the intensities of all the transitions
that have the same frequency. Thus we must change $|x_{nn'}|^2$ in (9.1):

    $|x_{nn'}|^2 \to \sum_{m=-l}^{l} \sum_{m'=-l'}^{l'} |(nlm|x|n'l'm')|^2$    (9.7)

The same holds for $y$ and $z$. In the continuous spectrum the values
of the energy parameter form a continuum, which means that one
cannot specify the quantum numbers $l'$ and $m'$ by simply stating
the value of $E$. Hence we must change $|(E_n|x|E)|^2$ in (9.2):

    $|(E_n|x|E)|^2 \to \sum_{m=-l}^{l} \sum_{l'=0}^{\infty} \sum_{m'=-l'}^{l'} |(nlm|x|El'm')|^2$    (9.8)

We note that because of the selection rule, which we derive below,
the sum (9.8) contains only a finite number of terms.

With these changes formulas (9.1) and (9.2) hold for the
general case considered here.

If a magnetic field is present, one can distinguish the transitions
with different values of $m$. In this case the separate terms in (9.7)
may be of interest.

We now turn to the actual evaluation of the elements of Heisenberg's
matrices corresponding to the coordinates $x$, $y$, $z$ for the
discrete spectrum, namely

    $(nlm|x|n'l'm') = \iiint \bar\psi_{nlm}\psi_{n'l'm'}\; r\sin\theta\cos\varphi\; r^2\sin\theta\,d\theta\,d\varphi\,dr$    (9.9)

    $(nlm|y|n'l'm') = \iiint \bar\psi_{nlm}\psi_{n'l'm'}\; r\sin\theta\sin\varphi\; r^2\sin\theta\,d\theta\,d\varphi\,dr$    (9.10)

    $(nlm|z|n'l'm') = \iiint \bar\psi_{nlm}\psi_{n'l'm'}\; r\cos\theta\; r^2\sin\theta\,d\theta\,d\varphi\,dr$    (9.11)

where because of (8.1)

    $\bar\psi_{nlm}\psi_{n'l'm'} = e^{i\omega t} R_{nl}R_{n'l'}\, \bar Y_{lm} Y_{l'm'}$    (9.12)

or

    $\bar\psi_{nlm}\psi_{n'l'm'} = \frac{1}{4\pi}\, e^{i\omega t} R_{nl}R_{n'l'}\, P_l^{*m} P_{l'}^{*m'} e^{i(m'-m)\varphi}$    (9.13)

with $\omega$ being as in (9.5).

Each of the triple integrals factors into three simple integrals,
and the integrals with respect to $r$ in (9.9), (9.10), and (9.11) are
the same, namely

    $r(nl;\, n'l') = \int_0^\infty R_{nl}R_{n'l'}\, r^3\,dr$    (9.14)

We denote the integrals with respect to $\theta$ and $\varphi$ as follows:

    $(lm|\sin\theta\cos\varphi|l'm') = \frac{1}{4\pi} \iint P_l^{*m}P_{l'}^{*m'} e^{i(m'-m)\varphi} \sin^2\theta\cos\varphi\,d\theta\,d\varphi$    (9.15)

    $(lm|\sin\theta\sin\varphi|l'm') = \frac{1}{4\pi} \iint P_l^{*m}P_{l'}^{*m'} e^{i(m'-m)\varphi} \sin^2\theta\sin\varphi\,d\theta\,d\varphi$    (9.16)

    $(lm|\cos\theta|l'm') = \frac{1}{4\pi} \iint P_l^{*m}P_{l'}^{*m'} e^{i(m'-m)\varphi} \sin\theta\cos\theta\,d\theta\,d\varphi$    (9.17)

Thus the matrix elements for $x$, $y$, $z$ will be equal (apart from
the time factor $e^{i\omega t}$) to the products of (9.14) by (9.15), (9.16),
and (9.17) respectively. For a continuous spectrum all the formulas
will remain the same with one obvious exception: we must
change $R_{n'l'}$ to $R_{El'}$. Then

    $r(nl;\, El') = \int_0^\infty R_{nl}R_{El'}\, r^3\,dr$    (9.14*)

Let us evaluate the integrals in (9.15), (9.16), and (9.17). Since
these integrals are factors in (9.9), (9.10), and (9.11), we see
that if for some values of $l$, $m$, $l'$, $m'$ they vanish, the corresponding
elements of Heisenberg's matrices vanish too. This is the
selection rule.

We start with integral (9.17) as the simpler one. Integrating
with respect to $\varphi$, we see that it is nonzero only if $m = m'$ since

    $\frac{1}{2\pi} \int_0^{2\pi} e^{i(m'-m)\varphi}\,d\varphi = \delta_{mm'}$    (9.18)

To evaluate the integral with respect to $\theta$ we introduce a new
variable

    $x = \cos\theta$

Then (9.17) reads

    $(lm|\cos\theta|l'm') = \delta_{mm'}\, \frac{1}{2} \int_{-1}^{1} P_l^{*m}(x)\, P_{l'}^{*m}(x)\, x\,dx$    (9.19)

If we use the recursion relation (6.11), we can change
$xP_l^{*m}(x)$ to

    $\left( \frac{(l+1)^2 - m^2}{4(l+1)^2 - 1} \right)^{1/2} P_{l+1}^{*m}(x) + \left( \frac{l^2 - m^2}{4l^2 - 1} \right)^{1/2} P_{l-1}^{*m}(x)$    (9.20)

Bringing in the orthogonality and normalization conditions for
$P_l^{*m}(x)$ yields

    $(lm|\cos\theta|l'm') = \delta_{mm'} \left[ \left( \frac{(l+1)^2 - m^2}{4(l+1)^2 - 1} \right)^{1/2} \delta_{l+1,\,l'} + \left( \frac{l^2 - m^2}{4l^2 - 1} \right)^{1/2} \delta_{l-1,\,l'} \right]$    (9.21)

Hence a matrix element does not vanish only if $l - l' = \pm 1$.

To evaluate the integrals (9.15) and (9.16) it is convenient to
build their linear combination

    $(lm|\sin\theta\,e^{i\varphi}|l'm') = (lm|\sin\theta\cos\varphi|l'm') + i\,(lm|\sin\theta\sin\varphi|l'm')$    (9.22)

which we can use to find the expressions for (9.15) and (9.16) in
the following way (the bar denotes complex conjugation):

    $(lm|\sin\theta\cos\varphi|l'm') = \frac{1}{2} \left[ (lm|\sin\theta\,e^{i\varphi}|l'm') + \overline{(l'm'|\sin\theta\,e^{i\varphi}|lm)} \right]$    (9.23)

    $(lm|\sin\theta\sin\varphi|l'm') = \frac{1}{2i} \left[ (lm|\sin\theta\,e^{i\varphi}|l'm') - \overline{(l'm'|\sin\theta\,e^{i\varphi}|lm)} \right]$    (9.24)

In explicit form, (9.22) is

    $(lm|\sin\theta\,e^{i\varphi}|l'm') = \frac{1}{4\pi} \iint P_l^{*m}P_{l'}^{*m'} e^{i(m'-m+1)\varphi} \sin^2\theta\,d\theta\,d\varphi$    (9.25)

If we integrate with respect to $\varphi$ and introduce the variable
$x = \cos\theta$, we get

    $(lm|\sin\theta\,e^{i\varphi}|l'm') = \delta_{m,\,m'+1}\, \frac{1}{2} \int_{-1}^{1} (1 - x^2)^{1/2} P_l^{*m}(x)\, P_{l'}^{*m-1}(x)\,dx$    (9.26)

By changing $m$ to $m - 1$ in (6.12) we find that

    $(1 - x^2)^{1/2} P_{l'}^{*m-1}(x) = \left( \frac{(l'+m)(l'+m+1)}{4(l'+1)^2 - 1} \right)^{1/2} P_{l'+1}^{*m}(x) - \left( \frac{(l'-m)(l'-m+1)}{4l'^2 - 1} \right)^{1/2} P_{l'-1}^{*m}(x)$    (9.27)

which we then substitute into (9.26). Applying the normalization
and orthogonality conditions for $P_l^{*m}(x)$ and also expressions of
type $f(l')\,\delta_{l,\,l'+1} = f(l-1)\,\delta_{l-1,\,l'}$, we get

    $(lm|\sin\theta\,e^{i\varphi}|l'm') = \delta_{m-1,\,m'} \left[ \left( \frac{(l+m)(l+m-1)}{4l^2 - 1} \right)^{1/2} \delta_{l-1,\,l'} - \left( \frac{(l-m+1)(l-m+2)}{4(l+1)^2 - 1} \right)^{1/2} \delta_{l+1,\,l'} \right]$    (9.28)

whence

    $(l'm'|\sin\theta\,e^{i\varphi}|lm) = \delta_{m+1,\,m'} \left[ \left( \frac{(l+m+1)(l+m+2)}{4(l+1)^2 - 1} \right)^{1/2} \delta_{l+1,\,l'} - \left( \frac{(l-m)(l-m-1)}{4l^2 - 1} \right)^{1/2} \delta_{l-1,\,l'} \right]$    (9.29)

What remains to be calculated is the half-sum of (9.28) and (9.29)
and the half-difference divided by $i$. According to (9.23) and
(9.24) this gives us the matrix elements (9.15) and (9.16).

We see that (9.15), (9.16), and (9.17) and hence the matrix
elements for $x$, $y$, $z$ are nonzero only if

    $l - l' = \pm 1$    (9.30)

which is the selection rule for the azimuthal quantum number, $l$.
The rule states that transitions between s and p spectral terms,
p and d terms, etc. are possible, whereas transitions between, say,
s and d terms are forbidden. This agrees with experimental data.
As to the magnetic quantum number $m$, the elements of the
matrices for coordinate $z$ do not vanish only if

    $m - m' = 0$    (9.31)

and the matrix elements for $x$ and $y$ do not vanish only if

    $m - m' = \pm 1$    (9.32)

These two conditions are the selection rule for $m$. Transitions
that satisfy (9.31) produce light that is polarized along the z axis,
whereas those that satisfy (9.32) produce light that is polarized
in the xy plane. As we have already mentioned, transitions that
correspond to definite values of $m$ and $m'$ can be observed only
in the presence of a magnetic field (directed along the z axis).
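
The selection rules (9.30)-(9.32) amount to a simple test on the two pairs of quantum numbers. The sketch below is only an illustration; the function name and the string labels for the coordinates are assumptions introduced here.

```python
def dipole_allowed(l, m, l2, m2):
    """Coordinates whose matrix element (lm|.|l'm') is not forced to vanish.

    Implements l - l' = +-1 (9.30), m - m' = 0 for z (9.31) and
    m - m' = +-1 for x and y (9.32). Returns a set drawn from {"x", "y", "z"}.
    """
    if l < 0 or l2 < 0 or abs(m) > l or abs(m2) > l2:
        raise ValueError("need l >= 0 and |m| <= l for both states")
    if abs(l - l2) != 1:
        return set()
    if m == m2:
        return {"z"}
    if abs(m - m2) == 1:
        return {"x", "y"}
    return set()


print(dipole_allowed(0, 0, 1, 0))   # s-p transition with m = m'
print(dipole_allowed(0, 0, 2, 0))   # s-d transition
print(dipole_allowed(1, 1, 2, 2))   # p-d transition with m' = m + 1
```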

We present the matrix elements (9.15), (9.16), and (9.17),
which correspond to various transitions, in table form:

For $l' = l - 1$:

    $(lm|\sin\theta\cos\varphi|l-1\ m-1) = \frac{1}{2} \left( \frac{(l+m)(l+m-1)}{4l^2 - 1} \right)^{1/2}$

    $(lm|\sin\theta\cos\varphi|l-1\ m+1) = -\frac{1}{2} \left( \frac{(l-m)(l-m-1)}{4l^2 - 1} \right)^{1/2}$

    $(lm|\sin\theta\sin\varphi|l-1\ m-1) = -\frac{i}{2} \left( \frac{(l+m)(l+m-1)}{4l^2 - 1} \right)^{1/2}$

    $(lm|\sin\theta\sin\varphi|l-1\ m+1) = -\frac{i}{2} \left( \frac{(l-m)(l-m-1)}{4l^2 - 1} \right)^{1/2}$

    $(lm|\cos\theta|l-1\ m) = \left( \frac{l^2 - m^2}{4l^2 - 1} \right)^{1/2}$

For $l' = l + 1$:

    $(lm|\sin\theta\cos\varphi|l+1\ m-1) = -\frac{1}{2} \left( \frac{(l-m+1)(l-m+2)}{4(l+1)^2 - 1} \right)^{1/2}$

    $(lm|\sin\theta\cos\varphi|l+1\ m+1) = \frac{1}{2} \left( \frac{(l+m+1)(l+m+2)}{4(l+1)^2 - 1} \right)^{1/2}$

    $(lm|\sin\theta\sin\varphi|l+1\ m-1) = \frac{i}{2} \left( \frac{(l-m+1)(l-m+2)}{4(l+1)^2 - 1} \right)^{1/2}$

    $(lm|\sin\theta\sin\varphi|l+1\ m+1) = \frac{i}{2} \left( \frac{(l+m+1)(l+m+2)}{4(l+1)^2 - 1} \right)^{1/2}$

    $(lm|\cos\theta|l+1\ m) = \left( \frac{(l+1)^2 - m^2}{4(l+1)^2 - 1} \right)^{1/2}$

Let us now build sums of type (9.7), which we will need for
the intensities. With the help of the relationship

    $\sum_{m=-l}^{l} m^2 = \frac{1}{3}\, l(l+1)(2l+1)$    (9.33)

we easily find that

    $\sum_{m,\,m'} |(lm|\cos\theta|l-1\ m')|^2 = \frac{1}{3}\, l$    (9.34)

and by analogy

    $\sum_{m,\,m'} |(lm|\sin\theta\cos\varphi|l-1\ m')|^2 = \frac{1}{3}\, l$    (9.35)

and

    $\sum_{m,\,m'} |(lm|\sin\theta\sin\varphi|l-1\ m')|^2 = \frac{1}{3}\, l$    (9.36)

All three sums have the same value $l/3$, which was to be expected
since after we exclude $m$ (in the summing process) any
specific direction in the system becomes meaningless and all
three axes play the same role. The sums with $l' = l + 1$ can be
found if we change $l$ to $l + 1$, which gives $(l + 1)/3$.
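
The sums (9.34) and (9.35) can be checked in exact arithmetic from the squared table entries for $l' = l - 1$. This is a minimal sketch; the helper names are illustrative, and the squared moduli are taken directly from the table above.

```python
from fractions import Fraction


def cos_sq(l, m):
    """|(lm|cos θ|l-1 m)|^2 = (l^2 - m^2) / (4 l^2 - 1)."""
    return Fraction(l * l - m * m, 4 * l * l - 1)


def sincos_sq(l, m):
    """Sum over m' = m - 1 and m' = m + 1 of |(lm|sin θ cos φ|l-1 m')|^2."""
    denom = 4 * (4 * l * l - 1)
    to_lower = Fraction((l + m) * (l + m - 1), denom)   # m' = m - 1
    to_upper = Fraction((l - m) * (l - m - 1), denom)   # m' = m + 1
    return to_lower + to_upper


def summed_over_m(term, l):
    """Sum a per-m contribution over m = -l, ..., l, as in (9.7)."""
    return sum(term(l, m) for m in range(-l, l + 1))


for l in range(1, 5):
    # Both printed sums should equal l/3, as in (9.34) and (9.35).
    print(l, summed_over_m(cos_sq, l), summed_over_m(sincos_sq, l))
```

The sum for $\sin\theta\sin\varphi$ has the same squared moduli as the one for $\sin\theta\cos\varphi$, so it gives the same value.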

Our results make it possible to find the general formula for
intensities. In the discrete spectrum the transition intensities are

    $I(nl;\, n'\,l-1) = \frac{4e^2}{3c^3} \left( \frac{E_{nl} - E_{n'\,l-1}}{\hbar} \right)^4 |r(nl;\, n'\,l-1)|^2\, l$    (9.37)

    $I(nl;\, n'\,l+1) = \frac{4e^2}{3c^3} \left( \frac{E_{nl} - E_{n'\,l+1}}{\hbar} \right)^4 |r(nl;\, n'\,l+1)|^2\, (l+1)$    (9.37*)

To find the intensities of transition from the continuous spectrum
to the discrete one, we must take the sum of the corresponding
expressions and multiply it by $\Delta E$:

    $I(nl;\, E)\,\Delta E = \frac{4e^2}{3c^3} \left( \frac{E_{nl} - E}{\hbar} \right)^4 \left[ |r(nl;\, E\,l-1)|^2\, l + |r(nl;\, E\,l+1)|^2\, (l+1) \right] \Delta E$    (9.38)

Finally, in the case of a Coulomb field we must sum with respect
to $l$, since in this case $E_{nl}$ does not depend on $l$.



Chapter V 


THE COULOMB FIELD 


1. General remarks 

Here we will examine a particular case of the general problem
considered in the previous chapter: the state of a particle that
is attracted to or repelled from a fixed centre (the nucleus of an
atom) via the Coulomb law. This is an interesting case because,
on the one hand, certain important physical problems (the hydrogen
atom, for instance) reduce to it, and, on the other, it permits
an exact solution. The results that we obtained in Chapter IV
apply in full to the case of a Coulomb
field. For instance, the separation of variables, the angular dependence
of the wave function (spherical harmonics), and the selection
rule can all be used. More than that, in the case of a
Coulomb field it is possible to solve the radial equation exactly
and thus find the energy levels and the intensities and frequencies
of the spectral lines, thereby completing the solution.

This solution is simple enough to use as the starting approximation
when a constant electric field disturbs a hydrogen atom
(the Stark effect). We will consider this problem too.

Finally, the theory of the motion of a particle repelled via the
Coulomb law makes it possible to deduce the Rutherford law for
the scattering of α-particles from nuclei. This theory is also an
interesting illustration of the probabilistic interpretation of quantum
mechanics.

2. The radial equation for the hydrogen atom. 

Atomic units 

The potential energy of the electron in the hydrogen atom, the
electron being attracted via the Coulomb law to the nucleus
(proton), is

    $U(r) = -\frac{e^2}{r}$    (2.1)

with $r$ the distance from the electron to the nucleus, which can be
considered fixed at the origin of coordinates, since it is much more
massive than the electron. (The ratio of nuclear mass to electron
mass for hydrogen is approximately 1836.) By Eq. (3.16), Chapter IV,
we arrive at the equation for the radial functions of the
hydrogen atom:

    $\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} - \frac{l(l+1)}{r^2}\,R + \frac{2m}{\hbar^2} \left( E + \frac{e^2}{r} \right) R = 0$    (2.2)

If we were to account for the motion of the nucleus, we would
have an equation of the same type, but instead of the electron
mass, $m$, there would be the reduced mass

    $m' = \frac{mM}{m + M}$    (2.3)

where $M$ is the mass of the nucleus.

Let us introduce a system of units with Planck's constant $h$
divided by $2\pi$, the electron charge, and the electron mass as base
units:

    $\hbar = h/2\pi$,  $h = 6.626 \times 10^{-27}$ erg s

    $e = 4.80 \times 10^{-10}$ esu

    $m = 9.11 \times 10^{-28}$ g    (2.4)

In practical terms this means that in such a “natural” system of
units the unit of length is

    $a = \frac{\hbar^2}{me^2} = 0.529 \times 10^{-8}$ cm    (2.5)

and the unit of energy is

    $E_0 = \frac{e^2}{a} = 27.21$ eV    (2.6)

called a hartree. The unit of velocity will then be $e^2/\hbar$, which is
approximately 1/137 of the speed of light.
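
As a rough numerical check of (2.5) and (2.6), the atomic units can be recomputed from the CGS values in (2.4). The sketch below assumes two constants that are not given in the text: the speed of light and the erg-per-electronvolt conversion factor.

```python
import math

# CGS values from (2.4)
h = 6.626e-27          # Planck's constant, erg s
e = 4.80e-10           # electron charge, esu
m = 9.11e-28           # electron mass, g

# Assumed values, not taken from the text
c = 2.998e10           # speed of light, cm/s
ERG_PER_EV = 1.602e-12

hbar = h / (2 * math.pi)
a_cm = hbar ** 2 / (m * e ** 2)      # unit of length, (2.5)
E0_erg = e ** 2 / a_cm               # unit of energy, (2.6)
E0_eV = E0_erg / ERG_PER_EV
v0 = e ** 2 / hbar                   # unit of velocity
ratio = c / v0                       # should be close to 137

print(f"a   = {a_cm:.3e} cm")
print(f"E_0 = {E0_eV:.2f} eV")
print(f"c / v_0 = {ratio:.1f}")
```

With the three-digit inputs of (2.4) the results agree with (2.5) and (2.6) only to about the same number of digits.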

With this in mind, in Eq. (2.2) we put

    $r_1 = \frac{r}{a}$,  $\varepsilon = \frac{E}{E_0}$    (2.7)

(do not confuse $\varepsilon$ with the smallness parameter in perturbation
theory). After this the equation reads

    $\frac{d^2R}{dr_1^2} + \frac{2}{r_1}\frac{dR}{dr_1} + \left( 2\varepsilon + \frac{2}{r_1} - \frac{l(l+1)}{r_1^2} \right) R = 0$    (2.8)

If we then substitute

    $R = \frac{y}{\sqrt{r_1}}$    (2.9)

into Eq. (2.8), we get

    $\frac{d^2y}{dr_1^2} + \frac{1}{r_1}\frac{dy}{dr_1} + \left( 2\varepsilon + \frac{2}{r_1} - \frac{s^2}{4r_1^2} \right) y = 0$    (2.10)

where we put

    $s = 2l + 1$    (2.11)

This equation occurs in a number of problems (to name some,
Dirac's theory of the hydrogen atom, the Stark effect, the scattering
of α-particles), and parameter $s$ may not necessarily be an
odd integer. This is why we will consider Eq. (2.10) in greater
detail and will assume $s$ to be not an integer but simply a nonnegative
quantity, $s \ge 0$ (obviously this can be done since $s$
enters into the equation only through $s^2$).


3. Solution of an auxiliary problem 

From the general analysis of the radial equation (Section 7,
Chapter IV) we know that negative values of $\varepsilon$ correspond to a
discrete spectrum and positive values to a continuous spectrum.
To study the discrete spectrum we introduce

    $x = r_1 (-8\varepsilon)^{1/2}$    (3.1)

as the independent variable and put

    $\lambda = (-2\varepsilon)^{-1/2}$    (3.2)

Here $\lambda$ will obviously be real (we will consider it positive); $x$ will
be real and will change within the same limits as $r_1$, namely, between
0 and $\infty$. Equation (2.10) then reads

    $\frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + \left( -\frac{1}{4} + \frac{\lambda}{x} - \frac{s^2}{4x^2} \right) y = 0$    (3.3)

or

    $-\frac{d}{dx}\left( x\frac{dy}{dx} \right) + \left( \frac{x}{4} + \frac{s^2}{4x} \right) y = \lambda y$    (3.3*)

The operator on the left-hand side of (3.3*) is Hermitian, and $\lambda$
plays the role of a parameter. By introducing the quantities (3.1)
and (3.2) we eliminate, so to say, the continuous spectrum and
reduce the solution of Eq. (2.10) to the solution of an auxiliary
problem, namely, the eigenvalue equation (3.3*).

Let us study the behaviour of the solutions of Eq. (3.3) for very small and very
large values of $x$. We could use the results of Section 7, Chapter IV,
but it is simpler to repeat our reasoning for Eq. (3.3).

For small values of $x$ we put

    $y = x^{\alpha} + a x^{\alpha+1} + \ldots$    (3.4)

and get for $\alpha$ two values:

    $\alpha = \pm s/2$    (3.5)

For large values of $x$ we put

    $y = e^{-\alpha x} x^{\beta} \left( 1 + \frac{b}{x} + \ldots \right)$    (3.6)

and get for both $\alpha$ and $\beta$ two values:

    $\alpha = \frac{1}{2}$,  $\beta = -\frac{1}{2} + \lambda$    (3.7)

and

    $\alpha = -\frac{1}{2}$,  $\beta = -\frac{1}{2} - \lambda$    (3.7*)

From this we conclude that the sought solution for small values
of $x$ must be of the form

    $y = C x^{s/2} (1 + ax + \ldots)$    (3.8)

and for large values of $x$ of the form

    $y = C' e^{-x/2} x^{\lambda - 1/2} \left( 1 + \frac{b}{x} + \ldots \right)$    (3.9)

Hence if we put

    $y = e^{-x/2} x^{s/2} Q(x)$    (3.10)

then function $Q(x)$ must satisfy the following conditions:

    $Q(x)$ is finite at $x = 0$
    $Q(x)$ is of the order of $x^{\lambda - (s+1)/2}$ as $x \to \infty$    (3.11)

and the equation for $y$, (3.3), leads us to the following equation
for $Q(x)$:

    $x\frac{d^2Q}{dx^2} + (s + 1 - x)\frac{dQ}{dx} + \left( \lambda - \frac{s+1}{2} \right) Q = 0$    (3.12)

This equation can be solved in two ways: by using power series
or by using definite integrals. We will apply the first method here
and use the second method when we solve an analogous equation
for the continuous spectrum.

Let us seek the solution to Eq. (3.12) in the form

    $Q = \sum_{n=0}^{\infty} a_n x^n$    (3.13)

When the series is substituted into the equation and the coefficients
of the powers of $x$ are equated with zero, we get the
relationship

    $n(n + s)\,a_n + \left( -n + \lambda + \frac{1}{2} - \frac{s}{2} \right) a_{n-1} = 0$    (3.14)

which serves as an equation for successive determination of the
expansion coefficients.
Coefficient $a_0$ remains arbitrary, and the other coefficients are
expressed in terms of it in the following manner:

    $a_1 = \frac{(s+1)/2 - \lambda}{1 \cdot (s+1)}\, a_0$

    $a_2 = \frac{(s+1)/2 - \lambda + 1}{2(s+2)}\, a_1 = \frac{[(s+1)/2 - \lambda]\,[(s+1)/2 - \lambda + 1]}{1 \cdot 2\,(s+1)(s+2)}\, a_0$    (3.15)

Hence, if we introduce the confluent hypergeometric function

    $F(\alpha;\, \gamma;\, x) = 1 + \frac{\alpha}{\gamma}\frac{x}{1!} + \frac{\alpha(\alpha+1)}{\gamma(\gamma+1)}\frac{x^2}{2!} + \ldots$    (3.16)

we can write

    $Q = a_0 F\left( \frac{s+1}{2} - \lambda;\ s + 1;\ x \right)$    (3.17)

Two cases are possible. If

    $\lambda = \frac{s+1}{2} + p$,  $p = 0, 1, 2, \ldots$    (3.18)

then $a_{p+1}$ and all subsequent coefficients vanish and the series in
(3.13) is terminated. For $Q$ we then have not an infinite series
but a polynomial. If condition (3.18) does not hold, the series
in (3.13) is an infinite one. It converges, since the ratio of any
two successive terms,

    $\frac{a_n x^n}{a_{n-1} x^{n-1}} = x\, \frac{n - \lambda - 1/2 + s/2}{n(n + s)}$    (3.19)

tends to zero for any $x$ as $n \to \infty$. From the same formula we see
that all terms beginning from a certain one have the same sign.
Hence the sum of the series grows faster as $x \to \infty$ than
any finite power of $x$, so that condition (3.11) does not hold.
This means that the second case is unsuitable, and condition (3.18)
must hold.

Thus the only solution to Eq. (3.12) that satisfies our conditions
is the polynomial

    $Q_p = a_0 F(-p;\ s + 1;\ x)$    (3.20)

or in explicit form

    $Q_p = a_0 \left( 1 - \frac{p}{1!}\frac{x}{s+1} + \frac{p(p-1)}{2!}\frac{x^2}{(s+1)(s+2)} - \ldots + (-1)^p \frac{x^p}{(s+1) \ldots (s+p)} \right)$    (3.20*)

If we put

    $a_0 = (s+1) \ldots (s+p) = \frac{\Gamma(s+p+1)}{\Gamma(s+1)}$    (3.21)

the corresponding functions $Q_p$, which we denote by $Q_p^s(x)$, are
polynomials in $s$ as well as in $x$:

    $Q_p^s(x) = \frac{\Gamma(s+p+1)}{\Gamma(s+1)}\, F(-p;\ s + 1;\ x)$    (3.22)

or

    $Q_p^s(x) = (-1)^p \left( x^p - \frac{p}{1!}(s+p)\,x^{p-1} + \frac{p(p-1)}{2!}(s+p)(s+p-1)\,x^{p-2} - \ldots + (-1)^p (s+p) \ldots (s+1) \right)$    (3.22*)

These are the generalized Laguerre polynomials; for $s = 0$ they
are called simply Laguerre polynomials.
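
The explicit form (3.22*) lends itself to a direct check. The sketch below builds the coefficients of $Q_p^s(x)$ and verifies, in exact arithmetic, that they satisfy Eq. (3.12) with $\lambda$ fixed by (3.18), that is, $xQ'' + (s + 1 - x)Q' + pQ = 0$. The function names are illustrative, and the coefficient formula is read off from (3.22*).

```python
from fractions import Fraction
from math import comb


def laguerre_Q(p, s):
    """Coefficients c[k] of x**k in Q_p^s(x), following (3.22*).

    c[k] = (-1)**k * C(p, k) * (s + k + 1)(s + k + 2)...(s + p)
    """
    coeffs = []
    for k in range(p + 1):
        product = Fraction(1)
        for j in range(k + 1, p + 1):
            product *= s + j
        coeffs.append((-1) ** k * comb(p, k) * product)
    return coeffs


def ode_residual(coeffs, p, s):
    """Coefficients of x Q'' + (s + 1 - x) Q' + p Q, power by power."""
    residual = []
    for k, c_k in enumerate(coeffs):
        c_next = coeffs[k + 1] if k + 1 < len(coeffs) else 0
        residual.append((k + 1) * (k + s + 1) * c_next + (p - k) * c_k)
    return residual


print(laguerre_Q(2, 0))                        # coefficients of 1, x, x**2
print(ode_residual(laguerre_Q(3, 1), 3, 1))    # every entry should be zero
```

Because the arithmetic uses `Fraction`, non-integer values such as `s = Fraction(1, 2)` can be tested as well.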


4. Some properties of generalized 
Laguerre polynomials 

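The generalized Laguerre polynomials, which are the solutions
to the differential equation

    $x\frac{d^2Q_p^s}{dx^2} + (s + 1 - x)\frac{dQ_p^s}{dx} + pQ_p^s = 0$    (4.1)

can be presented in the following form:

    $Q_p^s(x) = x^{-s} e^{x} \frac{d^p}{dx^p}\left( e^{-x} x^{s+p} \right)$    (4.2)

To prove this let us multiply Eq. (3.22*) by $x^s e^{-x}$ and write the
product thus:

    $x^s e^{-x} Q_p^s(x) = e^{-x}\frac{d^p x^{s+p}}{dx^p} + \frac{p}{1!}\frac{d(e^{-x})}{dx}\frac{d^{p-1} x^{s+p}}{dx^{p-1}} + \frac{p(p-1)}{2!}\frac{d^2(e^{-x})}{dx^2}\frac{d^{p-2} x^{s+p}}{dx^{p-2}} + \ldots + \frac{d^p(e^{-x})}{dx^p}\, x^{s+p}$

According to Leibnitz's theorem for the $p$th derivative of the
product of two functions,

    $x^s e^{-x} Q_p^s(x) = \frac{d^p}{dx^p}\left( e^{-x} x^{s+p} \right)$    (4.2*)

which is what we set out to prove.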

If we use the Cauchy formula, we can present this as

    $x^s e^{-x} Q_p^s(x) = \frac{p!}{2\pi i} \oint \frac{e^{-z} z^{s+p}}{(z - x)^{p+1}}\,dz$    (4.3)

Next we introduce a new variable

    $t = \frac{z - x}{z}$

and get

    $Q_p^s(x) = \frac{p!}{2\pi i} \oint \frac{e^{-xt/(1-t)}}{(1 - t)^{s+1}\, t^{p+1}}\,dt$    (4.4)

But according to the Cauchy formula

    $Q_p^s(x) = \left[ \frac{d^p}{dt^p} \frac{e^{-xt/(1-t)}}{(1 - t)^{s+1}} \right]_{t=0}$    (4.5)

Hence we obtain the Taylor series involving $Q_p^s(x)$:

    $(1 - t)^{-s-1} e^{-xt/(1-t)} = \sum_{p=0}^{\infty} \frac{t^p}{p!}\, Q_p^s(x)$    (4.6)

This is a convenient formula for deriving various relationships
between generalized Laguerre polynomials. If we multiply it by
$1 - t$ we get

    $(1 - t)^{-s} e^{-xt/(1-t)} = \sum_{p=0}^{\infty} \frac{t^p}{p!} \left[ Q_p^s(x) - pQ_{p-1}^s(x) \right]$    (4.7)

On the other hand, changing $s$ to $s - 1$ in (4.6), we get the
same expression on the left-hand side. We then compare the
coefficients in the two Taylor series and get

    $Q_p^{s-1}(x) = Q_p^s(x) - pQ_{p-1}^s(x)$    (4.8)

This formula allows us to express generalized Laguerre polynomials
with values of $s$ that differ from one another by an integer,
in terms of such polynomials with equal (the maximal of the two)
values of $s$.

Let us differentiate both sides of (4.6) with respect to $x$. If we
follow the same reasoning, we get

    $\frac{dQ_p^s(x)}{dx} = -pQ_{p-1}^{s+1}(x)$    (4.9)

Now let us differentiate (4.7) with respect to $t$ and express
both sides of the result in terms of series. We then find that

    $sQ_p^s(x) - xQ_p^{s+1}(x) = Q_{p+1}^s(x) - (p + 1)\,Q_p^s(x)$

or, after we change $s$ to $s - 1$,

    $xQ_p^s(x) = (p + s)\,Q_p^{s-1}(x) - Q_{p+1}^{s-1}(x)$    (4.10)

Last, we use (4.8) and obtain the recursion relation that connects
three successive polynomials of the same order, $s$:

    $(2p + s + 1 - x)\,Q_p^s(x) = Q_{p+1}^s(x) + p(p + s)\,Q_{p-1}^s(x)$    (4.11)

It is also easy to find other relations by using (4.8), (4.9), and
(4.10):

    $x\frac{dQ_p^s(x)}{dx} + sQ_p^s(x) = (p + s)\,Q_p^{s-1}(x)$    (4.12)

    $x\frac{dQ_p^s(x)}{dx} + (s - x)\,Q_p^s(x) = Q_{p+1}^{s-1}(x)$    (4.12*)

Differentiating (4.12*) with respect to $x$ and using (4.9), we arrive
at Eq. (4.1) for $Q_p^s(x)$.

These formulas, in turn, make it possible to derive the following
relations:

    $x\frac{dQ_{p-1}^s(x)}{dx} + (p + s - x)\,Q_{p-1}^s(x) = Q_p^s(x)$    (4.13)

    $x\frac{dQ_p^s(x)}{dx} - pQ_p^s(x) = -p(p + s)\,Q_{p-1}^s(x)$    (4.13*)

which also lead to differential equation (4.1).

In the future we will have to evaluate integrals of type

    $I = \int_0^\infty x^s e^{-x} Q_p^s(x)\, f(x)\,dx$    (4.14)

For this it is convenient to transform the integral using (4.2) and
integrating by parts $p$ times. We will have

    $I = \int_0^\infty \frac{d^p}{dx^p}\left( e^{-x} x^{p+s} \right) f(x)\,dx = (-1)^p \int_0^\infty e^{-x} x^{p+s} f^{(p)}(x)\,dx$    (4.15)

Putting

    $f(x) = e^{(1-a)x}$

we get

    $\int_0^\infty x^s e^{-ax} Q_p^s(x)\,dx = (a - 1)^p \int_0^\infty e^{-ax} x^{p+s}\,dx = \frac{(a - 1)^p}{a^{s+p+1}}\, \Gamma(s + p + 1)$    (4.16)

A more general type of integral,

    $\int_0^\infty x^{s+r} e^{-ax} Q_p^s(x)\,dx$    (4.17)

can be evaluated for integral values of $r$ by differentiating (4.16)
with respect to parameter $a$.

In (4.14) and (4.15) let us put $f(x) = x^r$, which yields

    $\int_0^\infty x^{s+r} e^{-x} Q_p^s(x)\,dx = (-1)^p\, r(r - 1) \ldots (r - p + 1)\, \Gamma(s + r + 1)$    (4.18)

If $f(x)$ is a polynomial of a degree lower than $p$, integral (4.14)
vanishes. Using this property we can evaluate (4.14) when

    $f(x) = x^2 Q_p^s(x) = (-1)^p \left( x^{p+2} - p(s+p)\,x^{p+1} + \frac{p(p-1)}{2}(s+p)(s+p-1)\,x^p + \ldots \right)$

    $f(x) = xQ_p^s(x) = (-1)^p \left[ x^{p+1} - p(s+p)\,x^p + \ldots \right]$

    $f(x) = Q_p^s(x) = (-1)^p x^p + \ldots$

    $f(x) = \frac{1}{x}\, Q_p^s(x) = \frac{\Gamma(s+p+1)}{\Gamma(s+1)}\, \frac{1}{x} + \ldots$

    $f(x) = \frac{1}{x^2}\, Q_p^s(x) = \frac{\Gamma(s+p+1)}{\Gamma(s+1)} \left( \frac{1}{x^2} - \frac{p}{s+1}\,\frac{1}{x} + \ldots \right)$

where the omitted terms are polynomials of a degree lower than $p$.
This yields

    $\int_0^\infty e^{-x} x^{s+2} [Q_p^s(x)]^2\,dx = p!\,\Gamma(s+p+1)\,[6p^2 + 6p(s+1) + (s+1)(s+2)]$    (4.19)

    $\int_0^\infty e^{-x} x^{s+1} [Q_p^s(x)]^2\,dx = p!\,\Gamma(s+p+1)\,(2p + s + 1)$    (4.20)

    $\int_0^\infty e^{-x} x^{s} [Q_p^s(x)]^2\,dx = p!\,\Gamma(s+p+1)$    (4.21)

    $\int_0^\infty e^{-x} x^{s-1} [Q_p^s(x)]^2\,dx = p!\,\Gamma(s+p+1)\,\frac{1}{s}$    (4.22)

    $\int_0^\infty e^{-x} x^{s-2} [Q_p^s(x)]^2\,dx = p!\,\Gamma(s+p+1)\,\frac{2p + s + 1}{(s-1)\,s\,(s+1)}$    (4.23)
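
For integer $s$ the integrals (4.19)-(4.22) can be verified exactly, since $\int_0^\infty e^{-x} x^k\,dx = k!$ for integer $k \ge 0$. The sketch below does this term by term; the function names are illustrative, and the polynomial coefficients are read off from (3.22*).

```python
from math import comb, factorial


def laguerre_Q(p, s):
    """Integer coefficients c[k] of x**k in Q_p^s(x) for integer s, from (3.22*)."""
    coeffs = []
    for k in range(p + 1):
        product = 1
        for j in range(k + 1, p + 1):
            product *= s + j
        coeffs.append((-1) ** k * comb(p, k) * product)
    return coeffs


def moment(p, s, r):
    """Integral over (0, inf) of exp(-x) x**(s+r) [Q_p^s(x)]**2.

    Requires integers with s + r >= 0; uses the factorial integral term by term.
    """
    c = laguerre_Q(p, s)
    return sum(ci * cj * factorial(s + r + i + j)
               for i, ci in enumerate(c)
               for j, cj in enumerate(c))


p, s = 3, 2
base = factorial(p) * factorial(s + p)            # p! * Gamma(s + p + 1)
print(moment(p, s, 2) == base * (6 * p * p + 6 * p * (s + 1) + (s + 1) * (s + 2)))  # (4.19)
print(moment(p, s, 1) == base * (2 * p + s + 1))  # (4.20)
print(moment(p, s, 0) == base)                    # (4.21)
print(s * moment(p, s, -1) == base)               # (4.22)
```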

Let us show that the polynomial $Q_p^s(x)$ has exactly $p$ positive
zeros, and that these zeros are real. If the number of zeros, say $q$,
were less than $p$, then, denoting them by $a_1, a_2, \ldots, a_q$, we
could build the function

    $f(x) = (x - a_1)(x - a_2) \ldots (x - a_q)$

The product of this function by $Q_p^s(x)$ would then not change sign
if $x$ were to change from 0 to $\infty$, and (4.14) would not vanish.
But this is impossible since $f(x)$ is a polynomial of a degree
lower than $p$, and, according to (4.15), integral (4.14) must
vanish. Hence the number of zeros cannot be less than $p$, and
since it cannot be greater than $p$, it must be $p$.


5. Eigenvalues and eigenfunctions 
of the auxiliary problem 

We return to the auxiliary problem of Section 3. The problem
was to find the eigenfunctions and eigenvalues of the equation

    $-\frac{d}{dx}\left( x\frac{dy}{dx} \right) + \left( \frac{x}{4} + \frac{s^2}{4x} \right) y = \lambda y$    (5.1)

We found the eigenvalues to be

    $\lambda = p + \frac{s+1}{2}$,  $p = 0, 1, 2, \ldots$    (5.2)

and the eigenfunctions were expressed in terms of the generalized
Laguerre polynomials:

    $y_p(x) = c_p\, x^{s/2} e^{-x/2} Q_p^s(x)$    (5.3)

The constant $c_p$ we will find from the normalization condition

    $\int_0^\infty [y_p(x)]^2\,dx = 1$

Recalling formula (4.21), we evaluate the integral and get

    $c_p = \frac{1}{[p!\,\Gamma(s+p+1)]^{1/2}}$    (5.4)

Thus the functions

    $y_p(x) = \frac{x^{s/2} e^{-x/2} Q_p^s(x)}{[p!\,\Gamma(s+p+1)]^{1/2}}$    (5.5)

are orthonormalized:

    $\int_0^\infty y_p(x)\, y_{p'}(x)\,dx = \delta_{pp'}$    (5.6)

and $\{y_p(x)\}$ is a complete set.
If we use (3.22), we can write

    $y_p(x) = \frac{1}{\Gamma(s+1)} \left( \frac{\Gamma(s+p+1)}{p!} \right)^{1/2} x^{s/2} e^{-x/2} F(-p;\ s + 1;\ x)$    (5.7)

It is sometimes convenient to use the normalized polynomials

    $Q_p^{*s}(x) = \frac{Q_p^s(x)}{[p!\,\Gamma(s+p+1)]^{1/2}}$    (5.8)

which yields the following expression for $y_p(x)$:

    $y_p(x) = x^{s/2} e^{-x/2} Q_p^{*s}(x)$    (5.9)

6. Energy levels and radial functions
for the discrete hydrogen spectrum 

Let us now solve our physical problem. First we will find the
hydrogen energy levels. The energy parameter in atomic units, $\varepsilon$,
was related to parameter $\lambda$ of the auxiliary problem thus:

    $\varepsilon = -\frac{1}{2\lambda^2}$    (6.1)

where $\lambda$ was equal to

    $\lambda = p + \frac{s+1}{2}$,  $p = 0, 1, 2, \ldots$    (6.2)

Parameter $s$ was connected with the azimuthal quantum number,

    $s = 2l + 1$    (6.3)

and $p$ (integral values only) was equal to the number of nodes
of the corresponding radial function, that is, according to the
definition in Section 8, Chapter IV, to the radial quantum number,
$n_r$. Hence $\lambda$ is the integer

    $\lambda = n_r + l + 1 = n$    (6.4)

which by definition is the principal quantum number.

The energy levels in atomic units are

    $\varepsilon = -\frac{1}{2n^2}$,  $n = 1, 2, \ldots$    (6.5)

Hence we see that they depend only on the value of the principal
quantum number. This special property of the Coulomb field is
of a fundamental nature. It is connected with the group of
transformations under which the Schrödinger equation (in momentum
space) for the hydrogen atom remains invariant. This
group, which characterizes the increased symmetry of the hydrogen
atom, coincides with the rotation group of a four-dimensional
sphere. We will return to this point in Part IV.

In conventional units the energy levels of the hydrogen atom
are

    $E = -\frac{2\pi R\hbar}{n^2}$    (6.6)

where

    $R = \frac{me^4}{4\pi\hbar^3} = \frac{2\pi^2 me^4}{h^3}$    (6.7)

is the Rydberg constant. Its value is

    $R = 3.2898421 \times 10^{15}\ \mathrm{s}^{-1}$    (6.8)

According to the remark made in Section 2, if we want to
account for the finiteness of the mass of the nucleus, we must in
all formulas [in (6.7) as well] change the electron mass to the
reduced mass (2.3).

The frequencies of the spectral lines are given by the Bohr
frequency relation

    $\frac{\omega_{nn'}}{2\pi} = R \left( \frac{1}{n'^2} - \frac{1}{n^2} \right)$    (6.9)

If $n' = 1$ and $n$ has values 2, 3, ..., we have a series of lines
called the Lyman series. If $n' = 2$ and $n = 3, 4, \ldots$ we have the
Balmer series, and if $n' = 3$ and $n = 4, 5, \ldots$ we have the
Paschen series.
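
The levels (6.5) and the frequency relation (6.9) are easy to tabulate. The sketch below uses the value of $R$ from (6.8); the speed of light used to convert frequencies to wavelengths is an assumed value not given in the text, and the function names are illustrative.

```python
R = 3.2898421e15             # Rydberg constant in s^-1, from (6.8)
C_CM_PER_S = 2.99792458e10   # speed of light in cm/s (assumed, not from the text)


def energy_au(n):
    """Energy level in atomic units, following (6.5)."""
    return -1.0 / (2 * n * n)


def line_frequency(n, n_prime):
    """Frequency in s^-1 of the transition n -> n' (n > n'), following (6.9)."""
    return R * (1.0 / n_prime ** 2 - 1.0 / n ** 2)


def wavelength_nm(n, n_prime):
    """Vacuum wavelength in nanometres for the same transition."""
    return C_CM_PER_S / line_frequency(n, n_prime) * 1e7


for name, n_prime in (("Lyman", 1), ("Balmer", 2), ("Paschen", 3)):
    lines = [round(wavelength_nm(n, n_prime), 1)
             for n in range(n_prime + 1, n_prime + 4)]
    print(name, lines)
```

These values assume an infinitely heavy nucleus; replacing the electron mass with the reduced mass shifts them slightly.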

Let us now express the radial functions in terms of the general¬ 
ized Laguerre polynomials. Argument x in these polynomials is 
related to the distance in atomic units, n, in the following way: 

x = ^ ( 6 . 10 ) 


[see formulas (3.1), (3.2), and (6.4)]. We use (2.9), (5.9), and 
(6.2)-(6.4) to get 

Rm (r ,) = c n (^-/ e- r ' ln Qf-V-i (-^) ( 6 . 11 ) 

where c n is a normalization factor determined from the condition 




( 6 . 12 ) 
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To evaluate the integral we introduce the variable x according to 
(6.10). Then 

4( f ) 3 \ (*>]■<(*-1 

0 

If we recall the relation between n and /, and p and s, we can 
express the above integral in terms of the ratio of (4.20) to 
(4.21). This yields 

oo 

\ x s+ 'e~ x [Q* s (x)] 2 dx — 2p + s+\—2n 
o 

whence 

4 = 7T. or 0 .—Jr < 613 > 

so that the normalized radial functions are 

Rm (rO = jr (-^y e- r ' ln Q-”tl , (-^-) (6.14) 

If we now use the expression for $Q^{s*}_p$ in terms of the confluent
hypergeometric function and keep in mind that $l$ is an integer, we
get

$$R_{nl}(r_1) = \frac{2}{n^2}\,[(n-l)(n-l+1)\cdots(n+l)]^{1/2}
\times \frac{(2r_1/n)^{l}}{(2l+1)!}\, e^{-r_1/n}\, F\!\left(-n+l+1;\ 2l+2;\ \frac{2r_1}{n}\right) \tag{6.15}$$

where according to definition (3.16)

$$F\!\left(-n+l+1;\ 2l+2;\ \frac{2r_1}{n}\right) = 1 - \frac{n-l-1}{(2l+2)\,1!}\,\frac{2r_1}{n} + \frac{(n-l-1)(n-l-2)}{(2l+2)(2l+3)\,2!}\left(\frac{2r_1}{n}\right)^{2} - \cdots \tag{6.16}$$

This leads us directly to the asymptotic expression of $R_{nl}(r_1)$
for very large values of $n$. The limit of (6.16) as $n \to \infty$ is

$$1 - \frac{2r_1}{(2l+2)\,1!} + \frac{(2r_1)^2}{(2l+2)(2l+3)\,2!} - \cdots = (2l+1)!\,(2r_1)^{-l-1/2}\, J_{2l+1}\!\left(\sqrt{8r_1}\right) \tag{6.17}$$

where $J_{2l+1}$ is the Bessel function of order $2l+1$. This yields

$$\lim_{n\to\infty} n^{3/2} R_{nl}(r_1) = \sqrt{\frac{2}{r_1}}\, J_{2l+1}\!\left(\sqrt{8r_1}\right) \tag{6.18}$$
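The limit (6.18) can be checked numerically. The following is a minimal sketch that evaluates (6.15) through the terminating series (6.16) and compares $n^{3/2}R_{nl}$ with the Bessel-function expression. The function names and the plain power series used for $J_{2l+1}$ are choices made here for illustration.

```python
import math


def hyp1f1_poly(p, c, x):
    """F(-p; c; x) for integer p >= 0: the terminating series (6.16)."""
    term, total = 1.0, 1.0
    for k in range(p):
        term *= -(p - k) * x / ((c + k) * (k + 1))
        total += term
    return total


def radial_discrete(n, l, r):
    """Normalized R_nl(r) in atomic units, following (6.15)."""
    prod = 1.0
    for k in range(n - l, n + l + 1):
        prod *= k
    x = 2.0 * r / n
    return (2.0 / n**2 * math.sqrt(prod) * x**l / math.factorial(2 * l + 1)
            * math.exp(-r / n) * hyp1f1_poly(n - l - 1, 2 * l + 2, x))


def bessel_j(order, z, terms=60):
    """Power series for J_order(z), integer order; adequate for moderate z only."""
    return sum((-1)**k * (z / 2.0)**(2 * k + order)
               / (math.factorial(k) * math.factorial(k + order))
               for k in range(terms))


def limit_618(l, r):
    """Right-hand side of (6.18)."""
    return math.sqrt(2.0 / r) * bessel_j(2 * l + 1, math.sqrt(8.0 * r))


for n in (10, 100, 1000):
    print(n, n**1.5 * radial_discrete(n, 1, 1.5), limit_618(1, 1.5))
```

The printed values approach the limit quickly because the $n$-dependent part of (6.15) is an even function of $n$, so the leading correction is of order $1/n^2$.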


Functions (6.18), in contrast, belong to the continuous spectrum.
The functions $R_{nl}(r_1)$ are eigenfunctions of the Hamiltonian and
possess the property of being orthogonal and normalized:

$$\int_0^\infty r_1^2\, R_{nl}(r_1)\, R_{n'l}(r_1)\, dr_1 = \delta_{nn'} \tag{6.19}$$

These functions, however, do not make up a complete set because
the Hamiltonian has a continuous spectrum in addition to the
discrete spectrum.

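Finally, we write some simple radial functions $R_{nl}(r_1)$:

$$R_{10}(r_1) = 2e^{-r_1} \tag{6.20}$$

$$R_{20}(r_1) = \frac{1}{\sqrt{2}}\, e^{-r_1/2}\left(1 - \frac{r_1}{2}\right) \tag{6.21}$$

$$R_{21}(r_1) = \frac{1}{2\sqrt{6}}\, r_1 e^{-r_1/2} \tag{6.21*}$$

$$R_{30}(r_1) = \frac{2}{3\sqrt{3}}\, e^{-r_1/3}\left[1 - 2\,\frac{r_1}{3} + \frac{2}{3}\left(\frac{r_1}{3}\right)^{2}\right] \tag{6.22}$$

$$R_{31}(r_1) = \frac{8}{27\sqrt{6}}\, r_1 e^{-r_1/3}\left[1 - \frac{1}{2}\left(\frac{r_1}{3}\right)\right] \tag{6.22*}$$

$$R_{32}(r_1) = \frac{4}{81\sqrt{30}}\, r_1^{2}\, e^{-r_1/3} \tag{6.22**}$$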
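The orthonormality property (6.19) can be verified for the functions just listed. The sketch below integrates $r_1^2 R_{nl} R_{n'l}$ with a composite Simpson rule. The cutoff `r_max` and the step count are assumptions chosen so that the neglected tail is far below rounding error.

```python
import math

# Radial functions (6.20)-(6.22**), keyed by (n, l); r is in atomic units.
RADIAL = {
    (1, 0): lambda r: 2.0 * math.exp(-r),
    (2, 0): lambda r: math.exp(-r / 2) * (1 - r / 2) / math.sqrt(2),
    (2, 1): lambda r: r * math.exp(-r / 2) / (2 * math.sqrt(6)),
    (3, 0): lambda r: (2 / (3 * math.sqrt(3)) * math.exp(-r / 3)
                       * (1 - 2 * r / 3 + 2 * r**2 / 27)),
    (3, 1): lambda r: 8 / (27 * math.sqrt(6)) * r * math.exp(-r / 3) * (1 - r / 6),
    (3, 2): lambda r: 4 / (81 * math.sqrt(30)) * r**2 * math.exp(-r / 3),
}


def overlap(key_a, key_b, r_max=150.0, steps=30000):
    """Composite Simpson estimate of the integral of r^2 R_a R_b over [0, r_max]."""
    h = r_max / steps

    def f(r):
        return r * r * RADIAL[key_a](r) * RADIAL[key_b](r)

    total = f(0.0) + f(r_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0


for key in RADIAL:
    print(key, round(overlap(key, key), 8))
print("(1,0)-(2,0):", round(overlap((1, 0), (2, 0)), 8))
```

Orthogonality in the sense of (6.19) holds only between functions with the same $l$; functions with different $l$ are made orthogonal by the angular factors.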


7. Solution of the differential equation 
for the continuous spectrum in the form 
of a definite integral 

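We turn to the case of the continuous spectrum. In Eq. (2.10),
which we write again,

$$\frac{d^2y}{dr_1^2} + \frac{1}{r_1}\frac{dy}{dr_1} + \left(2\varepsilon + \frac{2}{r_1} - \frac{s^2}{4r_1^2}\right) y = 0 \tag{7.1}$$

parameter $\varepsilon$ is a positive number. The variable

$$x_1 = r_1 (8\varepsilon)^{1/2} \tag{7.2}$$

is real. We also put

$$\lambda_1 = (2\varepsilon)^{-1/2} \tag{7.3}$$

Equation (7.1) then reads

$$\frac{d^2y}{dx_1^2} + \frac{1}{x_1}\frac{dy}{dx_1} + \left(\frac{1}{4} + \frac{\lambda_1}{x_1} - \frac{s^2}{4x_1^2}\right) y = 0 \tag{7.4}$$

which differs from Eq. (3.3) in the sign of one of the terms. We
can obtain this equation directly if in (3.3) we put

$$x = ix_1, \quad \lambda = -i\lambda_1 \tag{7.5}$$

This means that we can use the results of Section 3 and state that
the solution to (7.4), finite at $x_1 = 0$, is the function

$$y = e^{-ix_1/2}\, x_1^{s/2}\, Q(x_1) \tag{7.6}$$

with $Q$ satisfying the differential equation

$$x_1 \frac{d^2Q}{dx_1^2} + (s + 1 - ix_1)\frac{dQ}{dx_1} + \left[\lambda_1 - \frac{i}{2}(s+1)\right] Q = 0 \tag{7.7}$$

This implies that $Q$ can be represented by a series:

$$Q = g\,F\!\left(\frac{s+1}{2} + i\lambda_1;\ s+1;\ ix_1\right) \tag{7.8}$$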

We will need the asymptotic expansions of $Q$ and $y$ for large
values of $x_1$. The easiest way to obtain them is to express $Q$ in the
form of a definite integral. Equation (7.7) can be solved by using
Laplace's method, which consists of the following. We seek the
solution to (7.7) in the form of a contour integral in the complex
$z$ plane:

$$Q = \int e^{ix_1 z} f(z)\, dz \tag{7.9}$$

where $f(z)$ is still to be determined. Substituting (7.9) into (7.7)
and differentiating under the integral sign, we get

$$x_1 \int e^{ix_1 z}(-z^2 + z)\, f(z)\, dz + \int e^{ix_1 z}\left[\lambda_1 + \frac{i}{2}(s+1)(2z-1)\right] f(z)\, dz = 0$$

To get rid of the factor $x_1$ in the first term we integrate by parts.
This yields

$$\int z(1-z)\, f(z)\, d\!\left(-ie^{ix_1 z}\right) = -ie^{ix_1 z} z(1-z) f(z)\Big|_a^b + i\int e^{ix_1 z}\frac{d}{dz}\left[z(1-z) f(z)\right] dz$$

where the limits of integration are denoted by $a$ and $b$.

If we now require that the first term on the right-hand side of
the above expression vanish, that is

$$e^{ix_1 z} z(1-z) f(z)\Big|_a^b = 0 \tag{7.10}$$

the substitution of (7.9) into (7.7) results in

$$i\int e^{ix_1 z}\left(z(1-z)\frac{df}{dz} - \frac{s-1}{2}(1-2z) f(z) - i\lambda_1 f(z)\right) dz = 0 \tag{7.11}$$

We can satisfy this equation if we nullify the integrand, that
is, if $f(z)$ satisfies the differential equation

$$\frac{f'(z)}{f(z)} = \frac{s-1}{2}\,\frac{1-2z}{z(1-z)} + \frac{i\lambda_1}{z(1-z)} \tag{7.12}$$

which we solve and get

$$\log f(z) = \frac{s-1}{2}\log[z(1-z)] + i\lambda_1 \log\frac{z}{1-z} + \log c \tag{7.13}$$

Whence

$$f(z) = c\, z^{i\lambda_1 + (s-1)/2}(1-z)^{-i\lambda_1 + (s-1)/2}$$

We see that the solution to (7.7) is

$$Q = c\int e^{ix_1 z} z^{i\lambda_1 + (s-1)/2}(1-z)^{-i\lambda_1 + (s-1)/2}\, dz \tag{7.14}$$

only if we choose the contour of integration so that (7.10) is
satisfied and we write it as

$$e^{ix_1 z} z^{i\lambda_1 + (s+1)/2}(1-z)^{-i\lambda_1 + (s+1)/2}\Big|_a^b = 0 \tag{7.15}$$

The solution to (7.7) must be finite at $x_1 = 0$. Such a solution
can be found if we take the contour of integration that passes
from $z = 0$ to $z = 1$ along the real axis. This contour satisfies
condition (7.15) only if $s + 1 > 0$, which is always the case
since we assume that $s \ge 0$. Substituting the limits of integration
into (7.14), we get

$$Q = c\int_0^1 e^{ix_1 z} z^{i\lambda_1 + (s-1)/2}(1-z)^{-i\lambda_1 + (s-1)/2}\, dz \tag{7.14*}$$


To see whether the integral really does coincide with the series
(7.8) we expand the exponential into a power series and then
integrate termwise. If we use Euler's integral (or the complete
beta function)

$$B(p, q) = \int_0^1 z^{p-1}(1-z)^{q-1}\, dz = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)} \tag{7.16}$$

we find that

$$Q = c\sum_{k=0}^{\infty}\frac{(ix_1)^k}{k!}\int_0^1 z^{i\lambda_1 + k + (s-1)/2}(1-z)^{-i\lambda_1 + (s-1)/2}\, dz
= c\sum_{k=0}^{\infty}\frac{(ix_1)^k}{k!}\,\frac{\Gamma\!\left(\frac{s+1}{2} + k + i\lambda_1\right)\Gamma\!\left(\frac{s+1}{2} - i\lambda_1\right)}{\Gamma(s+k+1)}$$

or

$$Q = c\,\frac{\Gamma\!\left(\frac{s+1}{2} + i\lambda_1\right)\Gamma\!\left(\frac{s+1}{2} - i\lambda_1\right)}{\Gamma(s+1)}\, F\!\left(\frac{s+1}{2} + i\lambda_1;\ s+1;\ ix_1\right) \tag{7.17}$$

which is what we set out to prove. By comparing (7.8) and (7.17)
we see that the constants $g$ and $c$ of (7.8) and (7.14*) are
connected in the following way:

$$g = c\,\frac{\Gamma\!\left(\frac{s+1}{2} + i\lambda_1\right)\Gamma\!\left(\frac{s+1}{2} - i\lambda_1\right)}{\Gamma(s+1)} \tag{7.18}$$

We have thus found an expression for the confluent hypergeometric
function:

$$F\!\left(\frac{s+1}{2} + i\lambda_1;\ s+1;\ ix_1\right) = \frac{\Gamma(s+1)}{\Gamma\!\left(\frac{s+1}{2} + i\lambda_1\right)\Gamma\!\left(\frac{s+1}{2} - i\lambda_1\right)}
\times \int_0^1 e^{ix_1 z} z^{i\lambda_1 + (s-1)/2}(1-z)^{-i\lambda_1 + (s-1)/2}\, dz \tag{7.19}$$

Since (7.8) and (7.14*) are solutions to Eq. (7.7) and are finite
at $x_1 = 0$, we could derive (7.18) simply by comparing the two
at $x_1 = 0$. If in (7.19) we put $z = 1 - z_1$, we get the relationship

$$F\!\left(\frac{s+1}{2} + i\lambda_1;\ s+1;\ ix_1\right) = e^{ix_1} F\!\left(\frac{s+1}{2} - i\lambda_1;\ s+1;\ -ix_1\right) \tag{7.20}$$

This means that function $y$, defined by (7.6), will be real only if $g$
in (7.8) is real, which is what we will be assuming from now on.
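Both (7.19) and (7.20) lend themselves to a numerical spot check. The sketch below does this for $s = 2l + 1$ with integer $l \ge 1$, so that the integrand vanishes at the endpoints. It is an illustration under assumptions made here: the series routine `hyp1f1`, the midpoint rule, and the use of the product formula (9.24) from Section 9 for $|\Gamma(l+1+i\lambda_1)|^2$ are all choices of this example.

```python
import cmath
import math


def hyp1f1(a, c, z, terms=200):
    """Power series for the confluent hypergeometric function F(a; c; z)."""
    term, total = 1.0 + 0.0j, 1.0 + 0.0j
    for k in range(terms):
        term *= (a + k) * z / ((c + k) * (k + 1))
        total += term
    return total


def gamma_abs_sq(l, lam):
    """|Gamma(l + 1 + i*lam)|^2 via the product formula (9.24)."""
    prod = 1.0
    for k in range(1, l + 1):
        prod *= k * k + lam * lam
    return prod * math.pi * lam / math.sinh(math.pi * lam)


def integral_representation(l, lam, x, steps=20000):
    """Right-hand side of (7.19) for s = 2l + 1, midpoint rule on (0, 1)."""
    s = 2 * l + 1
    h = 1.0 / steps
    total = 0.0j
    for i in range(steps):
        z = (i + 0.5) * h  # midpoints avoid the endpoints z = 0 and z = 1
        total += (cmath.exp(1j * x * z)
                  * z ** (1j * lam + (s - 1) / 2)
                  * (1 - z) ** (-1j * lam + (s - 1) / 2))
    return math.factorial(s) / gamma_abs_sq(l, lam) * total * h


l, lam, x = 1, 0.7, 2.0
a, c = l + 1 + 1j * lam, 2 * l + 2
series_value = hyp1f1(a, c, 1j * x)
print("series (7.8):   ", series_value)
print("integral (7.19):", integral_representation(l, lam, x))
print("Kummer (7.20):  ", cmath.exp(1j * x) * hyp1f1(a.conjugate(), c, -1j * x))
print("e^{-ix/2} F:    ", cmath.exp(-0.5j * x) * series_value)
```

The last printed value has a vanishing imaginary part, which is the reality statement made after (7.20).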


8. Derivation of the asymptotic expression 

To derive the asymptotic expression of $Q$ for large positive
values of $x_1$, using (7.14), we deform the contour of integration
in (7.14) as follows. Instead of joining points 0 and 1 by a straight
line we construct a broken line that passes from 0 to $iA$, then
from $iA$ to $iA + 1$, and finally from $iA + 1$ to 1 ($A$ is any finite
positive number). Since in the part of the complex $z$ plane lying
between the deformed contour of integration and the straight
line from 0 to 1 the integrand in (7.14) is a holomorphic function,
the deformation will not change the value of the integral.
If we let $A$ grow to infinity, the integral along the section from
$iA$ to $iA + 1$ will tend to zero because of the exponential $e^{-Ax_1}$
under the integral sign. In the limit we get

$$Q = \int_0^1 e^{ix_1 z} f(z)\, dz = \int_0^{i\infty} e^{ix_1 z} f(z)\, dz + \int_{1+i\infty}^{1} e^{ix_1 z} f(z)\, dz \tag{8.1}$$

where $f(z)$ is found by solving (7.13). In the first integral on the
right-hand side we put

$$z = \xi e^{i\pi/2} \tag{8.2}$$

whereas in the second we put

$$1 - z = \xi e^{-i\pi/2} \tag{8.2*}$$

The limiting points for $\xi$ in both integrals are 0 and $\infty$. We have

$$Q = c\, e^{-\pi\lambda_1/2 + i(s+1)\pi/4}\int_0^\infty e^{-x_1\xi}\, \xi^{(s-1)/2 + i\lambda_1}(1 - i\xi)^{(s-1)/2 - i\lambda_1}\, d\xi
+ e^{ix_1} c\, e^{-\pi\lambda_1/2 - i(s+1)\pi/4}\int_0^\infty e^{-x_1\xi}\, \xi^{(s-1)/2 - i\lambda_1}(1 + i\xi)^{(s-1)/2 + i\lambda_1}\, d\xi \tag{8.3}$$

Now let us introduce a new variable

$$t = x_1 \xi \tag{8.4}$$

which yields

$$Q = c\, e^{i(s+1)\pi/4 - \pi\lambda_1/2}\,\frac{\Gamma\!\left(\frac{s+1}{2} + i\lambda_1\right)}{x_1^{(s+1)/2 + i\lambda_1}}\, J
+ e^{ix_1} c\, e^{-i(s+1)\pi/4 - \pi\lambda_1/2}\,\frac{\Gamma\!\left(\frac{s+1}{2} - i\lambda_1\right)}{x_1^{(s+1)/2 - i\lambda_1}}\, \bar{J} \tag{8.5}$$

where $J$ denotes the integral

$$J = \frac{1}{\Gamma\!\left(\frac{s+1}{2} + i\lambda_1\right)}\int_0^\infty e^{-t}\, t^{(s-1)/2 + i\lambda_1}\left(1 - \frac{it}{x_1}\right)^{(s-1)/2 - i\lambda_1} dt \tag{8.6}$$

and $\bar{J}$ the complex conjugate of $J$. The asymptotic expression for $J$
can easily be obtained. For this we need only expand the integrand
in inverse powers of $x_1$ and then integrate termwise. In
view of the fact that the series

$$\left(1 - \frac{it}{x_1}\right)^{(s-1)/2 - i\lambda_1} = 1 + \frac{i\lambda_1 - (s-1)/2}{1!}\,\frac{it}{x_1} + \cdots \tag{8.7}$$


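converges only if $|t| < |x_1|$, whereas with respect to $t$ we integrate
to infinity, the series obtained by termwise integration will
be a divergent, or asymptotic, series. Thus we have

$$J = F_{20}\!\left(\frac{s+1}{2} + i\lambda_1;\ \frac{1-s}{2} + i\lambda_1;\ \frac{i}{x_1}\right) \tag{8.8}$$

where $F_{20}(\alpha; \beta; z)$ is a formal power series, that is

$$F_{20}(\alpha; \beta; z) = 1 + \frac{\alpha\beta}{1!}\, z + \frac{\alpha(\alpha+1)\beta(\beta+1)}{2!}\, z^2 + \cdots \tag{8.9}$$

By (7.17) we can write our results as follows:

$$e^{-ix_1/2} F\!\left(\frac{s+1}{2} + i\lambda_1;\ s+1;\ ix_1\right) = e^{ix_1/2} F\!\left(\frac{s+1}{2} - i\lambda_1;\ s+1;\ -ix_1\right)$$

$$= \frac{\Gamma(s+1)}{\Gamma\!\left(\frac{s+1}{2} - i\lambda_1\right)}\, e^{i(s+1)\pi/4 - \pi\lambda_1/2}\, x_1^{-(s+1)/2 - i\lambda_1}\, e^{-ix_1/2}\,
F_{20}\!\left(\frac{s+1}{2} + i\lambda_1;\ \frac{1-s}{2} + i\lambda_1;\ \frac{i}{x_1}\right)$$

$$+ \frac{\Gamma(s+1)}{\Gamma\!\left(\frac{s+1}{2} + i\lambda_1\right)}\, e^{-i(s+1)\pi/4 - \pi\lambda_1/2}\, x_1^{-(s+1)/2 + i\lambda_1}\, e^{ix_1/2}\,
F_{20}\!\left(\frac{s+1}{2} - i\lambda_1;\ \frac{1-s}{2} - i\lambda_1;\ -\frac{i}{x_1}\right) \tag{8.10}$$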


where the equality is to be understood as asymptotic equality.
The last formula holds not only for real values of $x_1$ and $\lambda_1$ but
for their complex values as well, provided $-\pi/2 \le \arg x_1 \le \pi/2$.
For instance, if we put

$$\lambda_1 = i\left(\frac{s+1}{2} + p\right), \quad x_1 = -ix$$

with $p$ a positive integer, the second term in (8.10) will vanish
because

$$\frac{1}{\Gamma\!\left(\frac{s+1}{2} + i\lambda_1\right)} = \frac{1}{\Gamma(-p)} = 0$$

and (8.10) will give us not only asymptotic but exact equality:

$$e^{-x/2} F(-p;\ s+1;\ x) = (-1)^p\, \frac{\Gamma(s+1)}{\Gamma(s+p+1)}\, x^p e^{-x/2}\, F_{20}\!\left(-p-s;\ -p;\ -\frac{1}{x}\right) \tag{8.11}$$
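Because both sides of (8.11) are polynomials multiplied by the same exponential, the identity can be checked directly. The following sketch sums the two terminating series; the helper names are introduced here for illustration only.

```python
import math


def hyp1f1_poly(p, c, x):
    """F(-p; c; x) for integer p >= 0 (terminating confluent series)."""
    term, total = 1.0, 1.0
    for k in range(p):
        term *= -(p - k) * x / ((c + k) * (k + 1))
        total += term
    return total


def hyp2f0_poly(p, b, z):
    """F_20(-p; b; z) from (8.9); the series stops after p + 1 terms."""
    term, total = 1.0, 1.0
    for k in range(p):
        term *= -(p - k) * (b + k) * z / (k + 1)
        total += term
    return total


def lhs_811(p, s, x):
    """Left-hand side of (8.11): increasing powers of x."""
    return math.exp(-x / 2) * hyp1f1_poly(p, s + 1, x)


def rhs_811(p, s, x):
    """Right-hand side of (8.11): decreasing powers of x."""
    return ((-1)**p * math.gamma(s + 1) / math.gamma(s + p + 1)
            * x**p * math.exp(-x / 2) * hyp2f0_poly(p, -s - p, -1.0 / x))


for p in range(4):
    print(p, lhs_811(p, 2, 1.7), rhs_811(p, 2, 1.7))
```

For $p = 1$, $s = 0$ both sides reduce to $e^{-x/2}(1 - x)$, which is an easy case to confirm by hand.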


It is clear from a comparison of (8.11) with (3.11) and (3.22)
that (8.11) gives polynomials $Q^s_p(x)$ in increasing powers of $x$
(on the left) and diminishing powers of $x$ (on the right).

9. Radial functions for the continuous hydrogen
spectrum

On the basis of the results of the previous sections [see (2.9),
(2.11), (7.2), (7.3), (7.6), and (7.8)] we can write an expression
for the radial wave functions for the continuous spectrum of the
hydrogen atom:

$$R_{\varepsilon l}(r_1) = a(\varepsilon)\, e^{-ir_1\sqrt{2\varepsilon}}\left(r_1\sqrt{8\varepsilon}\right)^{l}
\times F\!\left(l + 1 + \frac{i}{\sqrt{2\varepsilon}};\ 2l+2;\ ir_1\sqrt{8\varepsilon}\right) \tag{9.1}$$


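The functions are real only if $a(\varepsilon)$ is real. To obtain the
asymptotic expression of the function for large values of $r_1$ we
substitute into (8.10) the actual values of $s$, $x_1$, $\lambda_1$ and put

$$e^{i\alpha} = \frac{\Gamma\!\left(l + 1 - \frac{i}{\sqrt{2\varepsilon}}\right)}{\left|\Gamma\!\left(l + 1 + \frac{i}{\sqrt{2\varepsilon}}\right)\right|} \tag{9.2}$$

Also, we replace the series $F_{20}$ by their asymptotic value $F_{20} = 1$.
We then get

$$R_{\varepsilon l}(r_1) \approx a(\varepsilon)\,\frac{(2l+1)!}{\left|\Gamma\!\left(l + 1 + \frac{i}{\sqrt{2\varepsilon}}\right)\right|}\, e^{-\pi/\sqrt{8\varepsilon}}\,\frac{1}{\sqrt{2\varepsilon}}
\times \frac{1}{r_1}\cos\!\left(r_1\sqrt{2\varepsilon} + \frac{1}{\sqrt{2\varepsilon}}\log\!\left(r_1\sqrt{8\varepsilon}\right) - (l+1)\frac{\pi}{2} + \alpha\right) \tag{9.3}$$

which we normalize thus:

$$\lim_{\Delta\varepsilon\to 0}\frac{1}{\Delta\varepsilon}\int_0^\infty r_1^2\left|\int_{\varepsilon}^{\varepsilon+\Delta\varepsilon} R_{\varepsilon l}(r_1)\, d\varepsilon\right|^2 dr_1 = 1 \tag{9.4}$$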


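It is sometimes useful to introduce instead of $\varepsilon$ a new parameter $k$:

$$\varepsilon = f(k) \tag{9.5}$$

where $f(k)$ is a monotonic function. We can then consider eigenfunction
$R(k, r_1)$ normalized thus:

$$\lim_{\Delta k\to 0}\frac{1}{\Delta k}\int_0^\infty r_1^2\left|\int_{k}^{k+\Delta k} R(k, r_1)\, dk\right|^2 dr_1 = 1 \tag{9.6}$$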



Let us find the relationship between the functions with different
normalizations. If $\Delta\varepsilon$ and $\Delta k$ are considered positive,

$$\Delta\varepsilon = |f'(k)|\,\Delta k$$

$$\int_{\varepsilon}^{\varepsilon+\Delta\varepsilon} R_{\varepsilon l}(r_1)\, d\varepsilon = |f'(k)|\int_{k}^{k+\Delta k} R_{\varepsilon l}(r_1)\, dk$$

because $\Delta k$ is infinitesimal. Substituting into (9.4) and comparing
with (9.6), we get

$$R(k, r_1) = R_{\varepsilon l}(r_1)\left|\frac{d\varepsilon}{dk}\right|^{1/2} \tag{9.7}$$

We can transform (9.4) or (9.6) in the following way. Since
the proper differentials belonging to different parts of the continuous
spectrum are orthogonal to each other, we can write (9.6)
as

$$\lim_{\Delta k\to 0}\frac{1}{\Delta k}\int_0^\infty r_1^2\left\{\int_{k}^{k+\Delta k} R(k', r_1)\, dk'\int_{k-\Delta_1 k}^{k+\Delta_1 k} R(k'', r_1)\, dk''\right\} dr_1 = 1 \tag{9.8}$$

where $\Delta_1 k > \Delta k$. We can pass on to the limit for $\Delta k \to 0$ with
$\Delta_1 k$ remaining finite. This yields

$$\int_0^\infty r_1^2 R(k, r_1)\int_{k-\Delta_1 k}^{k+\Delta_1 k} R(k', r_1)\, dk'\, dr_1 = 1 \tag{9.9}$$

Now let $R^0(k, r_1)$ be the unnormalized functions and $c(k)$ the
normalization factor, so that

$$R(k, r_1) = c(k)\, R^0(k, r_1) \tag{9.10}$$

Since $\Delta_1 k$ is infinitesimal, we can always take $c(k)$ outside the
integral sign and thus obtain an equation for $c(k)$:

$$\frac{1}{|c(k)|^2} = \int_0^\infty r_1^2 R^0(k, r_1)\int_{k-\Delta_1 k}^{k+\Delta_1 k} R^0(k', r_1)\, dk'\, dr_1 \tag{9.11}$$

(On the right the integral does not depend on $\Delta_1 k$, although it
might seem to at first glance.)

These considerations apply not only to this particular example 
but to the general case of normalizing eigenfunctions in a contin¬ 
uous spectrum. 

To evaluate integral (9.11) we must partition into two parts
the domain of integration with respect to $r_1$: one interval is from
zero to a value $r_1 = A$, and the other is from $A$ to $\infty$. The integral
over the finite interval (from 0 to $A$) tends to zero as $\Delta_1 k \to 0$.
What remains is the integral from $A$ to $\infty$:

$$\frac{1}{|c(k)|^2} = \lim_{\Delta_1 k\to 0}\int_A^\infty r_1^2 R^0(k, r_1)\int_{k-\Delta_1 k}^{k+\Delta_1 k} R^0(k', r_1)\, dk'\, dr_1 \tag{9.12}$$

The advantage of this formula is that by making $A$ sufficiently
large we can use the asymptotic expression for $R^0(k, r_1)$.

We put

$$k = (2\varepsilon)^{1/2} \tag{9.13}$$

and for $R^0(k, r_1)$ use the function having the asymptotic expression

$$R^0(k, r_1) = \frac{1}{r_1}\cos\!\left(kr_1 + \frac{1}{k}\log r_1 + \gamma\right) \tag{9.14}$$


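where $\gamma$ is a function of $k$ that can be found by comparing
(9.14) with (9.3). It can be shown that when evaluating (9.12) it
is possible to consider $k^{-1}\log r_1 + \gamma$ under the integral sign as
being negligible, since these terms are small compared to the
leading term, $kr_1$. Keeping only this term, we get

$$\frac{1}{|c(k)|^2} = \lim_{\Delta_1 k\to 0}\int_A^\infty \cos kr_1\int_{k-\Delta_1 k}^{k+\Delta_1 k}\cos k'r_1\, dk'\, dr_1
= \lim_{\Delta_1 k\to 0}\int_A^\infty (1 + \cos 2kr_1)\,\frac{\sin(\Delta_1 k\, r_1)}{r_1}\, dr_1 \tag{9.15}$$

We prove that

$$\lim_{\Delta_1 k\to 0}\int_A^\infty \frac{\cos 2kr_1\,\sin(\Delta_1 k\, r_1)}{r_1}\, dr_1 = 0 \tag{9.16}$$

Indeed, this is equal to

$$\lim_{\Delta_1 k\to 0}\left\{\frac{1}{2}\int_A^\infty \frac{\sin(2k + \Delta_1 k) r_1}{r_1}\, dr_1 - \frac{1}{2}\int_A^\infty \frac{\sin(2k - \Delta_1 k) r_1}{r_1}\, dr_1\right\}$$

and the brackets contain the difference of two convergent integrals,
which at $\Delta_1 k = 0$ coincide. What remains is

$$\frac{1}{|c(k)|^2} = \lim_{\Delta_1 k\to 0}\int_A^\infty \frac{\sin(\Delta_1 k\, r_1)}{r_1}\, dr_1 \tag{9.17}$$

If we introduce a new variable, $t = r_1\Delta_1 k$, we get

$$\frac{1}{|c(k)|^2} = \lim_{\Delta_1 k\to 0}\int_{A\Delta_1 k}^\infty \frac{\sin t}{t}\, dt = \int_0^\infty \frac{\sin t}{t}\, dt = \frac{\pi}{2} \tag{9.18}$$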


so that we finally obtain

$$c(k) = (2/\pi)^{1/2} \tag{9.19}$$

Hence the eigenfunctions normalized with respect to $k$ will
have the asymptotic expression

$$R(k, r_1) \approx \left(\frac{2}{\pi}\right)^{1/2}\frac{1}{r_1}\cos\!\left(kr_1 + \frac{1}{k}\log r_1 + \gamma\right) \tag{9.20}$$

whereas those normalized with respect to $\varepsilon$ will have, according
to (9.7), the asymptotic expression

$$R_{\varepsilon l}(r_1) \approx \left(\frac{2}{\pi}\right)^{1/2}\left(\frac{1}{2\varepsilon}\right)^{1/4}\frac{1}{r_1}\cos\!\left(r_1\sqrt{2\varepsilon} + \frac{1}{\sqrt{2\varepsilon}}\log r_1 + \gamma\right) \tag{9.21}$$


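Formula (9.3) now provides the following expression for the
factor $a(\varepsilon)$ in (9.1):

$$a(\varepsilon) = \frac{1}{\sqrt{\pi}}\,(8\varepsilon)^{1/4}\,\frac{e^{\pi/\sqrt{8\varepsilon}}}{(2l+1)!}\left|\Gamma\!\left(l + 1 + \frac{i}{\sqrt{2\varepsilon}}\right)\right| \tag{9.22}$$

We can also express

$$\left|\Gamma\!\left(l + 1 + \frac{i}{\sqrt{2\varepsilon}}\right)\right|^2 = \Gamma\!\left(l + 1 + \frac{i}{\sqrt{2\varepsilon}}\right)\Gamma\!\left(l + 1 - \frac{i}{\sqrt{2\varepsilon}}\right)$$

in terms of elementary functions. If we use notation (7.3), multiply

$$\Gamma(l + 1 + i\lambda_1) = (l + i\lambda_1)\cdots(1 + i\lambda_1)\,\Gamma(1 + i\lambda_1)$$

and

$$\Gamma(l + 1 - i\lambda_1) = (l - i\lambda_1)\cdots(1 - i\lambda_1)\,\Gamma(1 - i\lambda_1)$$

and use the relationship

$$\Gamma(1 + i\lambda_1)\,\Gamma(1 - i\lambda_1) = \frac{\pi\lambda_1}{\sinh \pi\lambda_1} \tag{9.23}$$

we find that

$$|\Gamma(l + 1 + i\lambda_1)|^2 = (1 + \lambda_1^2)(2^2 + \lambda_1^2)\cdots(l^2 + \lambda_1^2)\,\frac{2\pi\lambda_1}{e^{\pi\lambda_1} - e^{-\pi\lambda_1}} \tag{9.24}$$

$$\left|\Gamma\!\left(l + 1 + \frac{i}{\sqrt{2\varepsilon}}\right)\right|^2 = \left(1 + \frac{1}{2\varepsilon}\right)\left(4 + \frac{1}{2\varepsilon}\right)\cdots\left(l^2 + \frac{1}{2\varepsilon}\right)\frac{\pi(2/\varepsilon)^{1/2}\, e^{-\pi/\sqrt{2\varepsilon}}}{1 - \exp[-\pi(2/\varepsilon)^{1/2}]} \tag{9.24*}$$

Substitution of (9.24*) into (9.22) yields the final expression for
$a(\varepsilon)$:

$$a(\varepsilon) = \frac{2\{[1 + 1/(2\varepsilon)]\cdots[l^2 + 1/(2\varepsilon)]\}^{1/2}}{\{1 - \exp[-\pi(2/\varepsilon)^{1/2}]\}^{1/2}\,(2l+1)!} \tag{9.25}$$

With this value of $a(\varepsilon)$ formula (9.1), namely

$$R_{\varepsilon l}(r_1) = a(\varepsilon)\, e^{-ir_1\sqrt{2\varepsilon}}\left(r_1\sqrt{8\varepsilon}\right)^{l}
\times F\!\left(l + 1 + \frac{i}{\sqrt{2\varepsilon}};\ 2l+2;\ ir_1\sqrt{8\varepsilon}\right) \tag{9.1}$$

gives the normalized radial functions for the continuous hydrogen
spectrum.

To end Section 9 we give the asymptotic expression for $R_{\varepsilon l}(r_1)$
as $\varepsilon \to 0$. By (6.17) and (6.18) we have

$$\lim_{\varepsilon\to 0} R_{\varepsilon l}(r_1) = \sqrt{\frac{2}{r_1}}\, J_{2l+1}\!\left(\sqrt{8r_1}\right) = \lim_{n\to\infty} n^{3/2} R_{nl}(r_1) \tag{9.26}$$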
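The normalized continuum functions can be evaluated directly from (9.1) and (9.25). The sketch below does so with a complex power series and checks two properties stated in the text: the result is real, and for small $\varepsilon$ it approaches the limit (9.26). The series routines and the chosen test values are assumptions of this example.

```python
import cmath
import math


def hyp1f1(a, c, z, terms=300):
    """Power series for the confluent hypergeometric function F(a; c; z)."""
    term, total = 1.0 + 0.0j, 1.0 + 0.0j
    for k in range(terms):
        term *= (a + k) * z / ((c + k) * (k + 1))
        total += term
    return total


def a_eps(l, eps):
    """Normalization factor a(eps) from (9.25)."""
    prod = 1.0
    for k in range(1, l + 1):
        prod *= k * k + 1.0 / (2.0 * eps)
    denom = math.sqrt(1.0 - math.exp(-math.pi * math.sqrt(2.0 / eps)))
    return 2.0 * math.sqrt(prod) / (denom * math.factorial(2 * l + 1))


def radial_continuum(l, eps, r):
    """R_{eps l}(r) from (9.1); returned as a complex number whose imaginary part should vanish."""
    x1 = r * math.sqrt(8.0 * eps)
    lam1 = 1.0 / math.sqrt(2.0 * eps)
    return (a_eps(l, eps) * cmath.exp(-0.5j * x1) * x1**l
            * hyp1f1(l + 1 + 1j * lam1, 2 * l + 2, 1j * x1))


def bessel_j(order, z, terms=60):
    """Power series for J_order(z), integer order; adequate for moderate z only."""
    return sum((-1)**k * (z / 2.0)**(2 * k + order)
               / (math.factorial(k) * math.factorial(k + order))
               for k in range(terms))


def limit_926(l, r):
    """Common limit in (9.26)."""
    return math.sqrt(2.0 / r) * bessel_j(2 * l + 1, math.sqrt(8.0 * r))


for eps in (1e-2, 1e-4, 1e-6):
    print(eps, radial_continuum(1, eps, 1.5), limit_926(1, 1.5))
```

The power series is only practical for moderate $r_1\sqrt{8\varepsilon}$; for large arguments the asymptotic form (9.21) is the better tool.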

10. Intensities in the hydrogen spectrum 

Knowing the radial function, we can calculate the intensities
of spectral lines, which correspond to various transitions, and the
intensities in the continuous spectrum. We recall the formulas for
intensities which we derived in Section 9, Chapter IV. We also
introduce atomic units and use the fact that in hydrogen the
energy levels do not depend on the value of the azimuthal quantum
number, $l$. Thus for transitions in the discrete spectrum

$$I(n'l;\ n\,l-1) = (\varepsilon_{n'} - \varepsilon_n)^4\, |r(n'l;\ n\,l-1)|^2\, l \tag{10.1}$$

$$I(n'l;\ n\,l+1) = (\varepsilon_{n'} - \varepsilon_n)^4\, |r(n'l;\ n\,l+1)|^2\, (l+1) \tag{10.1*}$$

and for transitions from the discrete spectrum into the continuous
spectrum

$$I(n'l;\ \varepsilon)\,\Delta\varepsilon = (\varepsilon_{n'} - \varepsilon)^4\left[\,|r'(n'l;\ \varepsilon\,l-1)|^2\, l + |r'(n'l;\ \varepsilon\,l+1)|^2\,(l+1)\right]\Delta\varepsilon \tag{10.2}$$

For arbitrary values of quantum numbers $n'$, $n$, $l$ the integrals
$r(n'l;\ n\,l \pm 1)$ are rather complicated. But for small values of $n'$
the different $R_{n'l}$ are simple polynomials, and hence we can
easily integrate with respect to $r$ for arbitrary $n$ or $\varepsilon$. We restrict
ourselves to the Lyman ($n' = 1$) and Balmer ($n' = 2$) series.

At $n' = 1$ the only possible value of $l'$ is zero, which means
that we either evaluate

$$r(10;\ n1) = \int_0^\infty r_1^3\, R_{10}(r_1)\, R_{n1}(r_1)\, dr_1 \tag{10.3}$$

or

$$r'(10;\ \varepsilon 1) = \int_0^\infty r_1^3\, R_{10}(r_1)\, R_{\varepsilon 1}(r_1)\, dr_1 \tag{10.3*}$$

At $n' = 2$ there are two values of $l'$: zero and unity. At $l' = 0$

$$r(20;\ n1) = \int_0^\infty r_1^3\, R_{20}(r_1)\, R_{n1}(r_1)\, dr_1 \tag{10.4}$$

and at $l' = 1$

$$r(21;\ n0) = \int_0^\infty r_1^3\, R_{21}(r_1)\, R_{n0}(r_1)\, dr_1 \tag{10.5}$$

$$r(21;\ n2) = \int_0^\infty r_1^3\, R_{21}(r_1)\, R_{n2}(r_1)\, dr_1 \tag{10.6}$$

with similar integrals for the continuous spectrum. All the integrals
can be evaluated in closed form with the help of (4.8) and
(4.16) and their immediate generalizations for continuous-spectrum
eigenfunctions. These formulas are

$$Q^s_p(x) = Q^{s+1}_p(x) - p\,Q^{s+1}_{p-1}(x) \tag{4.8}$$

$$\int_0^\infty x^s e^{-ax}\, Q^s_p(x)\, dx = \frac{\Gamma(s+p+1)}{a^{s+1}}\left(1 - \frac{1}{a}\right)^{p} \tag{4.16}$$

To obtain this generalization let us express the generalized
Laguerre polynomial $Q^s_p$ in terms of the confluent hypergeometric
function, $F$, using formula (3.22). We obtain

$$(s+1)\,F(-p;\ s+1;\ x) = (s+1+p)\,F(-p;\ s+2;\ x) - p\,F(-p+1;\ s+2;\ x) \tag{10.7}$$

$$\int_0^\infty e^{-ax} x^s\, F(-p;\ s+1;\ x)\, dx = \frac{\Gamma(s+1)}{a^{s+1}}\left(1 - \frac{1}{a}\right)^{p} \tag{10.8}$$

In the second expression we can introduce a new parameter $b$
by changing $a$ to $a/b$ and $x$ to $bx$. This yields

$$\int_0^\infty e^{-ax} x^s\, F(-p;\ s+1;\ bx)\, dx = \frac{\Gamma(s+1)}{a^{s+1}}\left(1 - \frac{b}{a}\right)^{p} \tag{10.9}$$

Our results are valid not only for integral values of $p$ but for
any fractional and complex values as well if constants $a$ and $b$
satisfy the condition $|a| > |b|$ and integrals (10.8) and (10.9)
exist. To prove that (10.7) holds it is sufficient to compare the
coefficients of powers of $x$ on both sides of (10.7), and to prove
(10.8) and (10.9) we only have to express $F$ in the form of a
series and integrate termwise.

We now turn to integral (10.3). Substituting the radial functions
from (6.14) and (6.20), we get

$$r(10;\ n1) = \frac{4}{3}\, n^{-3/2}\left(1 - \frac{1}{n^2}\right)^{1/2}\int_0^\infty r^4 e^{-(1+1/n)r}\, F\!\left(-n+2;\ 4;\ \frac{2r}{n}\right) dr \tag{10.10}$$

Clearly, this is not an integral of type (10.9) since in (10.10)
the power of $r$ coincides with the second argument in $F$, whereas
in (10.9) this argument exceeds the power of $x$ by unity. To
evaluate (10.10) we must either differentiate (10.9) with respect
to $a$ or, using (10.7), increase the second argument in $F$ by unity
and then use (10.9). In either case we get

$$\int_0^\infty r^4 e^{-(1+1/n)r}\, F\!\left(-n+2;\ 4;\ \frac{2r}{n}\right) dr = 12\left(1 - \frac{1}{n}\right)^{n-3}\left(1 + \frac{1}{n}\right)^{-n-3} \tag{10.11}$$

Substitution into (10.10) yields

$$r(10;\ n1) = 16\, n^{7/2}\,\frac{(n-1)^{n-5/2}}{(n+1)^{n+5/2}} \tag{10.12}$$
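The closed form (10.12) can be compared with a direct numerical evaluation of the integral (10.3). The sketch below builds $R_{n1}$ from (6.15) and integrates with Simpson's rule; the cutoff `r_max` and the step count are assumptions of this example.

```python
import math


def hyp1f1_poly(p, c, x):
    """F(-p; c; x) for integer p >= 0 (terminating series)."""
    term, total = 1.0, 1.0
    for k in range(p):
        term *= -(p - k) * x / ((c + k) * (k + 1))
        total += term
    return total


def radial_discrete(n, l, r):
    """Normalized R_nl(r) in atomic units, following (6.15)."""
    prod = 1.0
    for k in range(n - l, n + l + 1):
        prod *= k
    x = 2.0 * r / n
    return (2.0 / n**2 * math.sqrt(prod) * x**l / math.factorial(2 * l + 1)
            * math.exp(-r / n) * hyp1f1_poly(n - l - 1, 2 * l + 2, x))


def dipole_numeric(n, r_max=80.0, steps=40000):
    """Simpson estimate of r(10; n1), the integral of r^3 R_10 R_n1 in (10.3)."""
    h = r_max / steps

    def f(r):
        return r**3 * 2.0 * math.exp(-r) * radial_discrete(n, 1, r)

    total = f(0.0) + f(r_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0


def dipole_closed(n):
    """Closed form (10.12)."""
    return 16.0 * n**3.5 * (n - 1)**(n - 2.5) / (n + 1)**(n + 2.5)


for n in (2, 3, 4, 5):
    print(n, dipole_numeric(n), dipole_closed(n))
```

For $n = 2$ both routes give about 1.2903 atomic units.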


Integrals (10.4)-(10.6) are evaluated using the same method.
After lengthy calculations we get

$$r(20;\ n1) = 2^8\sqrt{2}\; n^{7/2}\,(n^2 - 1)^{1/2}\,\frac{(n-2)^{n-3}}{(n+2)^{n+3}} \tag{10.13}$$

$$r(21;\ n0) = \frac{2^8}{\sqrt{6}}\; n^{9/2}\,\frac{(n-2)^{n-3}}{(n+2)^{n+3}} \tag{10.14}$$

$$r(21;\ n2) = \frac{2^{10}}{\sqrt{6}}\; n^{9/2}\,(n^2 - 1)^{1/2}\,\frac{(n-2)^{n-7/2}}{(n+2)^{n+7/2}} \tag{10.15}$$

In the same way we can calculate the integrals for the continuous
spectrum. For this we need not repeat all the steps but
simply use (10.12)-(10.15). We reason as follows. If we compare
(6.15) with (9.1) and (9.25), we see that the radial functions of
the discrete and continuous spectra can be expressed in the following
form:

$$R_{nl}(r_1) = \frac{2}{n^{3/2}}\, Q(n, l, r_1) \tag{10.16}$$

$$R_{\varepsilon l}(r_1) = \frac{2\,Q\!\left(1/(-2\varepsilon)^{1/2},\ l,\ r_1\right)}{\{1 - \exp[-\pi(2/\varepsilon)^{1/2}]\}^{1/2}} \tag{10.17}$$

where in both cases $Q$ is the same analytic function of $n$, namely

$$Q(n, l, r_1) = \left[\left(1 - \frac{1}{n^2}\right)\left(1 - \frac{4}{n^2}\right)\cdots\left(1 - \frac{l^2}{n^2}\right)\right]^{1/2}
\times \frac{(2r_1)^l}{(2l+1)!}\, e^{-r_1/n}\, F\!\left(-n+l+1;\ 2l+2;\ \frac{2r_1}{n}\right) \tag{10.18}$$

On the other hand, integrals of type (10.9) are also analytic
functions of their parameters. For instance, formula (10.11) holds
for pure imaginary $n = i/(2\varepsilon)^{1/2}$. Hence, if we put

$$\frac{1}{2}\, n^{3/2}\, r(10;\ n1) = 8\left(1 - \frac{1}{n}\right)^{n-5/2}\left(1 + \frac{1}{n}\right)^{-n-5/2} = f(n) \tag{10.19}$$

then by (10.3), (10.3*), (10.16), and (10.17) we get

$$r'(10;\ \varepsilon 1) = \frac{2 f\!\left(i/(2\varepsilon)^{1/2}\right)}{\{1 - \exp[-\pi(2/\varepsilon)^{1/2}]\}^{1/2}} \tag{10.20}$$

or

$$r'(10;\ \varepsilon 1) = \frac{16\,[1 + i(2\varepsilon)^{1/2}]^{\,i/(2\varepsilon)^{1/2} - 5/2}\,[1 - i(2\varepsilon)^{1/2}]^{-i/(2\varepsilon)^{1/2} - 5/2}}{\{1 - \exp[-\pi(2/\varepsilon)^{1/2}]\}^{1/2}} \tag{10.20*}$$


We could obtain $r'(20;\ \varepsilon 1)$, $r'(21;\ \varepsilon 0)$, and $r'(21;\ \varepsilon 2)$ in a similar
manner, but this can be done in a simpler way: by transferring
to the continuous spectrum in the final formulas for intensity.

To find the intensity of transition between two energy levels we
must sum (10.1) and (10.1*) over all possible values of $l$.

For the Lyman series the sum reduces to one term

$$I(1;\ n) = \frac{1}{2^4}\left(1 - \frac{1}{n^2}\right)^4 |r(10;\ n1)|^2$$

so that by (10.12)

$$I(1;\ n) = \frac{2^4\,(n-1)^{2n-1}}{n\,(n+1)^{2n+1}} \tag{10.21}$$

For the Balmer series we have

$$I(2;\ n) = \frac{1}{2^4}\left(\frac{1}{4} - \frac{1}{n^2}\right)^4
\times\left[\,|r(20;\ n1)|^2 + |r(21;\ n0)|^2 + 2\,|r(21;\ n2)|^2\right]$$

which according to (10.13)-(10.15) yields

$$I(2;\ n) = \frac{2^3\,(n-2)^{2n-3}}{n\,(n+2)^{2n+3}}\,(5n^2 - 4)(3n^2 - 4) \tag{10.22}$$
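The algebra leading from (10.12)-(10.15) to (10.21) and (10.22) is lengthy, so a numerical cross-check is reassuring. The sketch below evaluates the sums of squared integrals and compares them with the closed forms; the function names are introduced here for illustration.

```python
import math


def r10_n1(n):
    """Radial integral (10.12)."""
    return 16.0 * n**3.5 * (n - 1)**(n - 2.5) / (n + 1)**(n + 2.5)


def r20_n1(n):
    """Radial integral (10.13)."""
    return (2**8 * math.sqrt(2.0) * n**3.5 * math.sqrt(n * n - 1)
            * (n - 2)**(n - 3) / (n + 2)**(n + 3))


def r21_n0(n):
    """Radial integral (10.14)."""
    return 2**8 / math.sqrt(6.0) * n**4.5 * (n - 2)**(n - 3) / (n + 2)**(n + 3)


def r21_n2(n):
    """Radial integral (10.15)."""
    return (2**10 / math.sqrt(6.0) * n**4.5 * math.sqrt(n * n - 1)
            * (n - 2)**(n - 3.5) / (n + 2)**(n + 3.5))


def lyman_from_integrals(n):
    return (1.0 - 1.0 / n**2)**4 / 2**4 * r10_n1(n)**2


def lyman_closed(n):
    """Closed form (10.21)."""
    return 2**4 * (n - 1)**(2 * n - 1) / (n * (n + 1)**(2 * n + 1))


def balmer_from_integrals(n):
    weight = (0.25 - 1.0 / n**2)**4 / 2**4
    return weight * (r20_n1(n)**2 + r21_n0(n)**2 + 2.0 * r21_n2(n)**2)


def balmer_closed(n):
    """Closed form (10.22)."""
    return (2**3 * (n - 2)**(2 * n - 3) * (5 * n * n - 4) * (3 * n * n - 4)
            / (n * (n + 2)**(2 * n + 3)))


for n in range(3, 7):
    print(n, lyman_closed(n), balmer_closed(n))
```

The bracket in the Balmer sum reduces to $3(5n^2 - 4)(3n^2 - 4)$ after the common factors are pulled out, which is where the two quadratic factors in (10.22) come from.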

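If we want to find the intensities in the continuous spectrum, we
must multiply (10.21) and (10.22) by the square of the ratio of
the factors of $Q$ in (10.16) and (10.17), namely, by

$$\frac{n^3}{1 - \exp[-\pi(2/\varepsilon)^{1/2}]}$$

after which we change $n$ to $i/(2\varepsilon)^{1/2}$. The result reads

$$I(1;\ \varepsilon)\,\Delta\varepsilon = \frac{16\,[1 + i(2\varepsilon)^{1/2}]^{\,2i/(2\varepsilon)^{1/2} - 1}\,[1 - i(2\varepsilon)^{1/2}]^{-2i/(2\varepsilon)^{1/2} - 1}\,\Delta\varepsilon}{1 - \exp[-\pi(2/\varepsilon)^{1/2}]} \tag{10.23}$$

$$I(2;\ \varepsilon)\,\Delta\varepsilon = \frac{8\,(5 + 8\varepsilon)(3 + 8\varepsilon)\,[1 + 2i(2\varepsilon)^{1/2}]^{\,2i/(2\varepsilon)^{1/2} - 3}}{1 - \exp[-\pi(2/\varepsilon)^{1/2}]}
\times [1 - 2i(2\varepsilon)^{1/2}]^{-2i/(2\varepsilon)^{1/2} - 3}\,\Delta\varepsilon \tag{10.24}$$

To get rid of imaginary quantities we put in (10.23)

$$(2\varepsilon)^{1/2} = \tan\eta_1, \quad 0 < \eta_1 < \pi/2 \tag{10.25}$$

and in (10.24)

$$2\,(2\varepsilon)^{1/2} = \tan\eta_2, \quad 0 < \eta_2 < \pi/2 \tag{10.26}$$

Such substitutions lead to the following result:

$$I(1;\ \varepsilon)\,\Delta\varepsilon = \frac{16\exp(-4\eta_1\cot\eta_1)\,\tan\eta_1\,\Delta\eta_1}{1 - \exp(-2\pi\cot\eta_1)} \tag{10.27}$$

$$I(2;\ \varepsilon)\,\Delta\varepsilon = \frac{2\exp(-8\eta_2\cot\eta_2)\,(1 + 4\cos^2\eta_2)(1 + 2\cos^2\eta_2)\,\tan\eta_2\,\Delta\eta_2}{1 - \exp(-4\pi\cot\eta_2)} \tag{10.28}$$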

In conclusion we note that the intensities observed in experiments
depend not only on the properties of separate atoms but on
the number of atoms in the “initial” state, which will differ with
experimental conditions. For this reason a comparison of our
results with experiment can be only indirect.


11. The Stark effect. General remarks 

If an atom is placed inside an electric field, its energy levels
and hence spectral lines, which correspond to transitions between
the levels, are split, generally speaking, into several components.
The splitting of spectral lines in an electric field is called the
Stark effect. For the hydrogen atom this splitting is proportional
to the field; for other atoms it is proportional to the square of the
field. This difference can be explained by the following. When
there is an electric field directed, say, along the $z$ axis, the $z$ component
of angular momentum is a constant of the motion, which
implies that the magnetic quantum number, $m$, still assumes a
definite value. In the case of a non-Coulomb field each energy level
$E = E_{nl}$ will have corresponding to it only one eigenfunction with
a definite value of $m$, whereas for a Coulomb field there will be
several eigenfunctions. (They will differ by their azimuthal quantum
number $l$.) In other words, the energy levels for a non-Coulomb
field will be nondegenerate, and for a Coulomb field
degenerate. This requires applying perturbation theory in different
ways. Since the additional term in the potential energy is
proportional to the $z$ coordinate, the first-order correction to
the unperturbed problem is zero for nondegenerate eigenvalues.
Indeed, this correction is the diagonal matrix element, which
vanishes owing to the selection rule for $z$. Hence for a non-Coulomb
field it is the second-order correction that does not vanish,
and this is proportional to the square of the external field. The
situation changes for a Coulomb field (degenerate eigenvalues).
The first-order correction, which, we recall, is proportional to the
field, is calculated by nullifying the determinant $D(\lambda)$ [Eq. (5.4),
Chapter II], and it so happens that the correction is not zero.

We note that the state of an atom placed in an electric field
is not stationary, strictly speaking. From the fact that for very
great distances from the atom the electron's potential energy
tends to $-\infty$ it becomes evident that there is a (nonzero) probability
for the electron to separate itself from the atom. Thus perturbation
theory provides an approximate solution to the Schrödinger
equation in the sense of Section 1, Chapter II, and Section 8,
Chapter III, that is, a quasi-stationary state.

In studying the Stark effect in hydrogen it is convenient to
shift to parabolic coordinates instead of applying perturbation
theory directly to the Schrödinger equation in spherical coordinates.
This method has the advantage that in parabolic coordinates
the perturbed equation allows for a separation of variables, and
the problem reduces to solving equations whose unperturbed
eigenvalues are nondegenerate. We will restrict ourselves to the
first-order correction to energy levels.

12. The Schrödinger equation in parabolic coordinates

We write the eigenvalue equation for the Hamiltonian in atomic
units:

$$-\frac{1}{2}\nabla^2\psi - \frac{1}{r}\psi - gz\psi = E\psi \tag{12.1}$$

The perturbation Hamiltonian is $-gz$, where $g$ is the electric
field in atomic units. If we denote the electric field in conventional
units by $D$, then

$$D = \frac{e}{a^2}\, g = g \times 5.14 \times 10^{9}\ \mathrm{V\,cm^{-1}} \tag{12.2}$$

This is why even in very strong fields of $10^5$ or $10^6$ V cm$^{-1}$ parameter
$g$ will still be small.

Let us introduce parabolic coordinates

$$u = r + z, \quad v = r - z \tag{12.3}$$


The surfaces $u$ = constant and $v$ = constant represent an orthogonal
system of paraboloids. This can be seen from the equations

$$x^2 + y^2 + 2uz = u^2, \qquad x^2 + y^2 - 2vz = v^2 \tag{12.4}$$

It is convenient to perform the transition to parabolic coordinates
in three steps. First, we introduce the cylindrical coordinates
$z$, $\rho$, $\varphi$. Then we put

$$z + i\rho = \frac{1}{2}(\xi + i\eta)^2 \tag{12.5}$$

Finally, we express $\xi$ and $\eta$ in terms of $u$ and $v$. Separating the
real and imaginary parts in (12.5), we get

$$z = \frac{1}{2}(\xi^2 - \eta^2) \tag{12.6}$$

$$\rho = \xi\eta \tag{12.7}$$

On the other hand, the modulus of (12.5) is

$$r = \frac{1}{2}(\xi^2 + \eta^2) \tag{12.8}$$

Whence

$$\xi^2 = r + z = u, \quad \eta^2 = r - z = v \tag{12.9}$$
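The coordinate change (12.3)-(12.9) is easy to exercise in code. The following sketch converts between Cartesian and parabolic coordinates and checks the paraboloid equations (12.4); the function names are hypothetical helpers introduced here.

```python
import math


def to_parabolic(x, y, z):
    """Cartesian -> (u, v, phi) with u = r + z, v = r - z as in (12.3)."""
    r = math.sqrt(x * x + y * y + z * z)
    return r + z, r - z, math.atan2(y, x)


def from_parabolic(u, v, phi):
    """(u, v, phi) -> Cartesian, using rho = xi * eta = sqrt(u v) and z = (u - v) / 2."""
    rho = math.sqrt(u * v)
    return rho * math.cos(phi), rho * math.sin(phi), 0.5 * (u - v)


point = (0.3, -1.2, 0.8)
u, v, phi = to_parabolic(*point)
print("u, v, phi:", u, v, phi)
print("round trip:", from_parabolic(u, v, phi))

# Paraboloid equations (12.4) evaluated at the same point; both should vanish.
x, y, z = point
print(x * x + y * y + 2 * u * z - u * u, x * x + y * y - 2 * v * z - v * v)
```

For points very close to the positive $z$ axis the difference $r - z$ loses precision in floating point; a production routine would compute $v$ as $\rho^2/(r + z)$ there.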


Next, we square the modulus of the differential of (12.5),
express $\xi$ and $\eta$ in terms of $u$ and $v$, and get

$$dz^2 + d\rho^2 = (\xi^2 + \eta^2)(d\xi^2 + d\eta^2) = \frac{1}{4}(u + v)\left(\frac{du^2}{u} + \frac{dv^2}{v}\right)$$

so that the square of the element of arc length

$$ds^2 = dz^2 + d\rho^2 + \rho^2\, d\varphi^2 \tag{12.10}$$

is

$$ds^2 = \frac{u+v}{4u}\, du^2 + \frac{u+v}{4v}\, dv^2 + uv\, d\varphi^2 \tag{12.11}$$

The square root of the product of the three coefficients on the
right-hand side of (12.11) gives the volume element:

$$d\tau = \frac{1}{4}(u + v)\, du\, dv\, d\varphi \tag{12.12}$$

What remains to be found is the expression for the Laplacian
operator in parabolic coordinates. The simplest way to do this is
to write the general expression for the Laplacian operator in
terms of curvilinear orthogonal coordinates $q_1$, $q_2$, $q_3$:

$$\nabla^2\psi = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}\left(\frac{h_2 h_3}{h_1}\frac{\partial\psi}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\left(\frac{h_3 h_1}{h_2}\frac{\partial\psi}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\left(\frac{h_1 h_2}{h_3}\frac{\partial\psi}{\partial q_3}\right)\right] \tag{12.13}$$

where $h_1$, $h_2$, $h_3$ are the square roots of the coefficients in the
square of the element of arc length:

$$ds^2 = h_1^2\, dq_1^2 + h_2^2\, dq_2^2 + h_3^2\, dq_3^2 \tag{12.14}$$

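Applying (12.13) to our case, we find that

$$\nabla^2\psi = \frac{4}{u+v}\left[\frac{\partial}{\partial u}\left(u\frac{\partial\psi}{\partial u}\right) + \frac{\partial}{\partial v}\left(v\frac{\partial\psi}{\partial v}\right)\right] + \frac{1}{uv}\frac{\partial^2\psi}{\partial\varphi^2} \tag{12.15}$$

Now we substitute (12.15) into (12.1) and by (12.3) express $r$
and $z$ in terms of $u$ and $v$. After multiplying by $(u + v)/2 = r$
and carrying all terms to the left-hand side we have

$$\frac{\partial}{\partial u}\left(u\frac{\partial\psi}{\partial u}\right) + \frac{\partial}{\partial v}\left(v\frac{\partial\psi}{\partial v}\right) + \frac{1}{4}\left(\frac{1}{u} + \frac{1}{v}\right)\frac{\partial^2\psi}{\partial\varphi^2}
+ \left[1 + \frac{1}{2}E(u + v) + \frac{g}{4}(u^2 - v^2)\right]\psi = 0 \tag{12.16}$$

Obviously the equation can be solved by a separation of variables.
Indeed, if we put

$$\psi = U(u)\,V(v)\,e^{im\varphi} \tag{12.17}$$

Eq. (12.16) is satisfied if $U(u)$ and $V(v)$ are the solutions to the
following equations:

$$\frac{d}{du}\left(u\frac{dU}{du}\right) + \left(a + \frac{1}{2}Eu + \frac{g}{4}u^2 - \frac{m^2}{4u}\right)U = 0 \tag{12.18}$$

$$\frac{d}{dv}\left(v\frac{dV}{dv}\right) + \left(b + \frac{1}{2}Ev - \frac{g}{4}v^2 - \frac{m^2}{4v}\right)V = 0 \tag{12.19}$$

where

$$a + b = 1 \tag{12.20}$$

Here $m$ is simply the magnetic quantum number. Parameters $a$
and $b$, connected by (12.20), are found from the condition that
for all values of $u$ and $v$ from 0 to $\infty$ the solutions to Eqs. (12.18)
and (12.19) are finite.

13. Splitting of energy levels in an electric field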

Our aim is to find the corrections to the hydrogen energy levels
for small values of the principal quantum number, $n$, assuming
that $g$ is very small.

Let us introduce the new variables

$$u_1 = u(-2E)^{1/2}, \quad v_1 = v(-2E)^{1/2} \tag{13.1}$$

Since parameter $E$ is negative [its approximate value is
$E = -1/(2n^2)$], $u_1$ and $v_1$ are real and change from 0 to $\infty$. Equations
(12.18) and (12.19) in the new variables read

$$-\frac{d}{du_1}\left(u_1\frac{dU}{du_1}\right) + \left(\frac{u_1}{4} + \frac{m^2}{4u_1}\right)U - \frac{g_1 u_1^2}{4}\,U = a_1 U \tag{13.2}$$

$$-\frac{d}{dv_1}\left(v_1\frac{dV}{dv_1}\right) + \left(\frac{v_1}{4} + \frac{m^2}{4v_1}\right)V + \frac{g_1 v_1^2}{4}\,V = b_1 V \tag{13.3}$$

where

$$g_1 = \frac{g}{(-2E)^{3/2}} \tag{13.4}$$

and where parameters $a_1$ and $b_1$ are

$$a_1 = \frac{a}{(-2E)^{1/2}}, \quad b_1 = \frac{b}{(-2E)^{1/2}} \tag{13.5}$$

so that

$$a_1 + b_1 = (-2E)^{-1/2} \tag{13.6}$$

It follows from Eq. (13.4) that $g_1$ is a small constant only if
the values of the principal quantum number, $n$, are small, a fact
that we assumed at the start. If we consider $g_1$ known, Eqs. (13.2)
and (13.3) are the eigenvalue equations for the hermitian operators
with eigenvalues $a_1$ and $b_1$. If we find these eigenvalues, by
using (13.6) we can find the energy eigenvalues, $E$.

In Eqs. (13.2) and (13.3) we will consider the terms that
contain $g_1$ as the perturbation. The unperturbed equation is of
the form

$$-\frac{d}{dx}\left(x\frac{dy}{dx}\right) + \left(\frac{x}{4} + \frac{m^2}{4x}\right)y = \lambda y \tag{13.7}$$

which we recognize to be the one we studied in Sections 3-5. Its
eigenvalues are

$$\lambda = p + \frac{|m| + 1}{2}, \quad p = 0, 1, 2, \ldots \tag{13.8}$$

and its eigenfunctions are

$$y_p(x) = x^{|m|/2}\, e^{-x/2}\, Q^{|m|*}_p(x) \tag{13.9}$$

Hence in the zeroth-order approximation the eigenvalues and
eigenfunctions of Eqs. (13.2) and (13.3) are

$$a_1^0 = n_1 + \frac{|m| + 1}{2}, \quad n_1 = 0, 1, 2, \ldots \tag{13.10}$$

$$b_1^0 = n_2 + \frac{|m| + 1}{2}, \quad n_2 = 0, 1, 2, \ldots \tag{13.11}$$

$$U^0 = y_{n_1}(u_1), \quad V^0 = y_{n_2}(v_1) \tag{13.12}$$

In the same approximation Eqs. (13.10) and (13.11) give

$$(-2E)^{-1/2} = a_1^0 + b_1^0 = n_1 + n_2 + |m| + 1 = n \tag{13.13}$$

where $n$ assumes the values $n = 1, 2, 3, \ldots$ and is simply the
principal quantum number.

All the eigenvalues of the operator in (13.7) are non-degenerate. 
Therefore to obtain the first-order correction to (13.10) and (13.11) 
it is sufficient to find the diagonal matrix element of the pertur¬ 
bation, which is — gjU?/4 in Eq. (13.2) and gii>i/4 in Eq. (13.3). 
We have already evaluated the integral \ x 2 [y p (x)] 2 dx, and 

J 0 

according to (4.19) and (4.21) 

oo 

5 x s+ V* [<# (x)] 2 dx = 6p 2 + 6p (s -f 1) + (s + 1) (s + 2) (13.14) 
o 

Hence the eigenvalues in the first-order approximation are 

Oi = rtt -f I m 1 -[6/t 2 -f- 6«i (| m | + 1) 

+ (lm|+l)(|m! + 2)] (13.15) 

b\ — n? + ^ m 1 4" [6« 2 4" 6 /i 2 (I m | -+• f) 

4-(I nil 4-D (I ml 4-2)] (13.16) 

Their sum is 

+ b, = n + jgiti(th — n,) (13.17) 

where n is determined by (13.13). Then owing to (13.4) and 
(13.6) we have the approximate equality 

(- 2£)-*'■ = n 4-1 gn (n, - «,) (- 2E)" 3/ ’ (13.18) 

To simplify matters let us change (—2E)'h in the correction term 
to its approximate value 1/ra. We have 

(— 2 E)“ v * = n 4- -f (/h> — «i) 


(13.19) 
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Solving this equation for E, we get in the same approximation 

E - — 2 k+4 ( l3 - 2 °) 

Thus the energy levels depend, in fact, only on two quantum 
numbers: the principal quantum number, n, and the difference 
$n_2 - n_1$. For a given value of n this difference can assume all 
the values (integral, of course) from $-(n-1)$ to $(n-1)$. The 
pattern of the splitting of an energy term in an external 
electric field can be found by taking specific values of $n_2 - n_1$ one 
after another. Formula (13.20) agrees with experiment. 
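
The splitting pattern can be tabulated directly. The following is a minimal Python sketch, assuming that (13.20) has the form $E = -1/(2n^2) + \tfrac{3}{2} g n (n_2 - n_1)$ in atomic units, which is what solving (13.19) to first order in g gives. The function name `stark_levels` and the sample field strength are illustrative assumptions and do not come from the text.

```python
from fractions import Fraction

def stark_levels(n, g):
    """First-order Stark levels for principal quantum number n.

    Assumes E = -1/(2 n^2) + (3/2) g n (n2 - n1) in atomic units.
    Returns a dict mapping the difference d = n2 - n1 to the energy.
    """
    levels = {}
    # d runs over all integers from -(n - 1) to (n - 1).
    for d in range(-(n - 1), n):
        levels[d] = Fraction(-1, 2 * n * n) + Fraction(3, 2) * g * n * d
    return levels

# Illustrative field strength (hypothetical value).
g = Fraction(1, 1000)
for d, energy in sorted(stark_levels(3, g).items()):
    print(d, float(energy))
```

For a given n the sketch yields $2n - 1$ equally spaced components, with spacing $\tfrac{3}{2} g n$.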

We have seen that the state of the electron of the unperturbed 
hydrogen atom can be described either by specifying the quantum 
numbers $n, l, m$ (in spherical coordinates) or by specifying the 
quantum numbers $n_1, n_2, m$ (in parabolic coordinates). Since the 
eigenfunctions $\psi_{nlm}$ do not coincide with $\psi_{n_1 n_2 m}$ (the first are linear 
combinations of the second with the same values of n and m, and 
vice versa), the states $nlm$ and $n_1 n_2 m$ differ. But then in what 
state is the electron of the unperturbed hydrogen atom? We can 
judge the state of the atom by measuring its energy E, which for 
an unperturbed hydrogen atom depends only on n. This means 
that in a state with a definite energy only this quantum number 
has a definite value, whereas the other quantum numbers (say, 
$n_1, n_2, m$) remain not only unknown but physically undefined. 
The most we can say is that the wave function of such an atom 
is a linear combination of the eigenfunctions with the fixed value 
of n and various values of the other quantum numbers (in fact, 
this is a combination of the functions $\psi_{nlm}$ as well as of the 
functions $\psi_{n_1 n_2 m}$). To specify the value of $n_1 - n_2$ we must act on 
the atom in a definite way. Namely, we must place it in an 
electric field. When this is done and the difference $n_1 - n_2$ is 
measured, the state of the atom becomes known with greater accuracy. 
We can then say that here the wave function is expressed in terms 
of the eigenfunctions with definite values of n and $n_1 - n_2$. Hence 
it is the physical interaction with the atom that enables us to 
distinguish between states rather than mathematical arbitrariness 
(choice of coordinates or eigenfunctions). 

14. Scattering of α-particles. 

Statement of the problem 

Let us imagine that there is a coordinate system with the heavy 
nucleus of an atom at its origin. A plane wave representing an 
α-particle with definite momentum $p_z = p > 0$ and energy E 
falls onto this nucleus from the negative half of the z axis (from 
the left). The wave undergoes diffraction: the outgoing wave from 
the scattering centre interferes in the right half of the z axis 
(z > 0) with the incident wave. This means that the α-particle 
is “partially” deflected by the scattering centre and “partially” 
passes the scattering centre undisturbed (this must be understood 
as a linear superposition of states). For the sake of clarity we 
can visualize a whole plane of scattering centres and a beam of 
α-particles falling onto it in the direction of positive values of z. 
To the left of this plane the flux of α-particles is entirely along 
the z axis in the positive direction (that is, to the right). To the 
right of this plane it divides into two parts: the passing flux of 
α-particles and the deflected flux. 

What we want is to describe this phenomenon by using the 
wave function $\psi$ and to find the ratio of the flux of scattered 
α-particles to the incident flux as a function of the deflection angle. 

We denote the charge and mass of the α-particle by $2e$ and $m_\alpha$, 
the charge and mass of the nucleus (the target) by $Ze$ and $M$, 
and put 

$$\mu = \frac{m_\alpha M}{m_\alpha + M} \qquad (14.1)$$


When an α-particle hits an atom of a high-Z element, Coulomb 
repulsion between the two plays the main role, whereas the 
action of the electron shell on the α-particle is negligible. We 
can therefore assume the potential energy to be 

$$U(r) = \frac{2Ze^2}{r} \qquad (14.2)$$


The Schrödinger equation of our problem is 

$$-\frac{\hbar^2}{2\mu}\nabla^2\psi + \frac{2Ze^2}{r}\psi = i\hbar\frac{\partial\psi}{\partial t} \qquad (14.3)$$

and we will consider states with a definite energy equal to 

$$E = \frac{p^2}{2\mu} \qquad (14.4)$$

where p is the momentum of the α-particle at infinity. Let us also 
assume that the angular momentum of the α-particle about the z 
axis is zero, so that the wave function does not depend on angle $\varphi$. 

At negative values of z and large values of r the wave function 
by definition must be a plane wave 

$$\psi = e^{(i/\hbar)(pz - Et)} \qquad (14.5)$$

with $-\infty < z < 0$ and $r \to \infty$. Aside from this, to make our 
solution unique we specify that no ingoing waves from infinity 
are present. 

Since the energy has a definite value, we can put 

$$\psi = \psi^0 e^{-iEt/\hbar} \qquad (14.6)$$
where $\psi^0$ does not depend on t and satisfies the equation 

$$-\frac{\hbar^2}{2\mu}\nabla^2\psi^0 + \frac{2Ze^2}{r}\psi^0 = E\psi^0 \qquad (14.7)$$

Before solving this equation let us simplify its coefficients. This 
is done by introducing a new unit of length, 

$$r_0 = \frac{\hbar^2}{2Ze^2\mu} \qquad (14.8)$$

The new coordinates are 

$$x' = \frac{x}{r_0}, \quad y' = \frac{y}{r_0}, \quad z' = \frac{z}{r_0}, \quad r' = \frac{r}{r_0} \qquad (14.9)$$

We also put 

$$E = \varepsilon\,\frac{2Ze^2}{r_0} \qquad (14.10)$$

Equation (14.7) then reads 

$$-\frac{1}{2}\nabla'^2\psi^0 + \frac{1}{r'}\psi^0 = \varepsilon\psi^0 \qquad (14.11)$$

and condition (14.5) takes the form 

$$\psi^0 \sim \exp[i(2\varepsilon)^{1/2} z'] \qquad (14.12)$$

at $-\infty < z' < 0$ and $r' \to \infty$. 


15. Solution of equations 

Since according to the conditions of the problem the z axis 
plays a special role, we introduce parabolic coordinates: 

$$u = r' + z', \quad v = r' - z' \qquad (15.1)$$

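Next we use the results of Section 12, specifically Eq. (12.16), 
in which we (a) change the term +1 in the brackets to −1 
(because in our case the Coulomb energy has the opposite sign), 
(b) put g = 0, and (c) keep in mind that $\psi^0$ does not depend 
on angle $\varphi$. We get 

$$\frac{\partial}{\partial u}\left(u\frac{\partial\psi^0}{\partial u}\right) + \frac{\partial}{\partial v}\left(v\frac{\partial\psi^0}{\partial v}\right) + \left[\frac{\varepsilon}{2}(u+v) - 1\right]\psi^0 = 0 \qquad (15.2)$$
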
The condition at infinity, (14.12), is written in parabolic 
coordinates as 

$$\psi^0 \sim \exp[i(2\varepsilon)^{1/2}(u - v)/2] \qquad (15.3)$$

as $v \to \infty$ and for all values of u. We can satisfy this condition 
only if 

$$\psi^0 = \exp[i(\varepsilon/2)^{1/2} u]\,V \qquad (15.4)$$

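where V does not depend on u but satisfies the asymptotic 
condition 

$$V \sim \exp[-i(\varepsilon/2)^{1/2} v] \quad \text{as } v \to \infty \qquad (15.5)$$

If we now substitute (15.4) into (15.2), we see that the former 
is a solution only if V satisfies the equation 

$$\frac{d}{dv}\left(v\frac{dV}{dv}\right) + \left[i\left(\frac{\varepsilon}{2}\right)^{1/2} - 1 + \frac{1}{2}\varepsilon v\right]V = 0 \qquad (15.6)$$

Since $\varepsilon > 0$, we can put 

$$(2\varepsilon)^{1/2} v = v_1 \qquad (15.7)$$

after which we obtain 

$$\frac{d}{dv_1}\left(v_1\frac{dV}{dv_1}\right) + \left[\frac{i}{2} - \frac{1}{(2\varepsilon)^{1/2}} + \frac{1}{4}v_1\right]V = 0 \qquad (15.8)$$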

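This equation coincides with the one we studied in detail in 
Sections 7 and 8: 

$$\frac{d}{dx_1}\left(x_1\frac{dy}{dx_1}\right) + \left(\frac{x_1}{4} + \lambda_1 - \frac{s^2}{4x_1}\right)y = 0 \qquad (7.4)$$

In our case 

$$\lambda_1 = \frac{i}{2} - \frac{1}{(2\varepsilon)^{1/2}}, \quad s = 0, \quad x_1 = v_1 \qquad (15.9)$$

If for the sake of convenience we put 

$$b = \frac{1}{(2\varepsilon)^{1/2}}, \quad \lambda_1 = \frac{i}{2} - b \qquad (15.10)$$
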


by (7.6) and (7.8) we can write the solution to Eq. (15.8), finite 
at $v_1 = 0$, as 

$$V = c\,e^{-iv_1/2} F(-ib;\, 1;\, iv_1) = c\,e^{-iv_1/2}\left[1 + bv_1 + \frac{(-ib)(-ib+1)}{(2!)^2}(iv_1)^2 + \ldots\right] \qquad (15.11)$$

Constant c can be found from condition (15.5), that is 

$$V \sim e^{-iv_1/2} \quad \text{as } v_1 \to \infty \qquad (15.12)$$

This requires that we use the asymptotic expansion for the 
confluent hypergeometric function, F, derived in Section 8 [see 
(8.10)]. For the specific values of the parameters we have 

$$e^{-iv_1/2} F(-ib;\, 1;\, iv_1) = \frac{e^{\pi b/2}}{\Gamma(1+ib)}\, e^{-i(v_1/2 - b\log v_1)}\, F_{20}(-ib;\, -ib;\, i/v_1) - \frac{b}{v_1}\,\frac{e^{\pi b/2}}{\Gamma(1-ib)}\, e^{i(v_1/2 - b\log v_1)}\, F_{20}(1+ib;\, 1+ib;\, -i/v_1) \qquad (15.13)$$

where $F_{20}$ are formal power series built according to (8.9). 
Condition (15.12) holds approximately if we put 

$$c = e^{-\pi b/2}\,\Gamma(1+ib) \qquad (15.14)$$


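Let us now turn to coordinates $r'$ and $z'$ and build the function 
$\psi^0$. On the basis of (15.1), (15.4), and (15.7) we get the following 
expressions for $\psi^0$. For small values of $v = r' - z'$ 

$$\psi^0 = e^{-\pi b/2}\,\Gamma(1+ib)\,e^{iz'/b}\left[1 + (r'-z') + \frac{(-ib)(-ib+1)}{(2!)^2}\left(\frac{i(r'-z')}{b}\right)^2 + \ldots\right] \qquad (15.15)$$

and for large values of $v = r' - z'$ 

$$\psi^0 = e^{iz'/b}\,e^{ib\log[(r'-z')/b]}\left(1 + \frac{b^3}{i(r'-z')} + \ldots\right) - \frac{b^2}{r'-z'}\,\frac{\Gamma(1+ib)}{\Gamma(1-ib)}\,e^{ir'/b}\,e^{-ib\log[(r'-z')/b]}\,(1 + \ldots) \qquad (15.16)$$

Note that parameter b is proportional to the wavelength. 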


16. The Rutherford scattering law 

Formula (15.16) provides an overall solution for our problem. 
At large distances from the atom and not too close to the z axis 
(at least at a distance of several wavelengths from this axis) the 
wave function consists of two terms. The first term, being 
approximately 

$$\psi_1 = e^{iz'/b + ib\log[(r'-z')/b]} \qquad (16.1)$$

represents the passing plane wave corresponding to the undeflected 
beam of α-particles. The second term, approximately being 

$$\psi_2 = -b^2\,\frac{\Gamma(1+ib)}{\Gamma(1-ib)}\,\frac{1}{r'-z'}\,e^{ir'/b - ib\log[(r'-z')/b]} \qquad (16.2)$$

is the scattered spherical wave corresponding to the deflected 
particles. 

To estimate the relative intensity of the deflected and undeflected 
particle fluxes, let us build, using (16.1) and then (16.2), 
two quantities that are proportional to the current densities. We 
recall formulas (2.10) and (2.11), Chapter III, and get 

$$\mathbf{S}_1 = \mathrm{grad}'\left(\frac{z'}{b} + b\log\frac{r'-z'}{b}\right) \qquad (16.3)$$

$$\mathbf{S}_2 = \frac{b^4}{(r'-z')^2}\,\mathrm{grad}'\left(\frac{r'}{b} - b\log\frac{r'-z'}{b}\right) \qquad (16.4)$$
Obviously, vector $\mathbf{S}_1$ at great distances is directed along the z 
axis, and $\mathbf{S}_2$ along the radius vector away from the scattering 
centre. 

If we denote the number of particles crossing an area $d\sigma$ per 
unit time by $n\,d\sigma$ and the number of particles scattered into the 
area that subtends a solid angle $d\omega$ at the origin (the target) per 
unit time by $n_\omega\,d\omega$, then by using (16.3) and (16.4) we find 

$$n_\omega = n r_0^2\left(\frac{b^2 r'}{r'-z'}\right)^2 \qquad (16.5)$$


If we denote the scattering angle by $\theta$, then 

$$z' = r'\cos\theta \qquad (16.6)$$

With this in mind we can write (16.5) in the following form: 

$$n_\omega = \frac{n r_0^2 b^4}{4\sin^4(\theta/2)} \qquad (16.7)$$

or, if we replace $r_0$ and b by their values given by (14.8), (14.10) 
and (15.10), 

$$n_\omega = \frac{n Z^2 e^4}{4E^2\sin^4(\theta/2)} \qquad (16.8)$$

Ernest Rutherford derived this formula on the basis of classical 
mechanical notions and verified it in experiments. The law of 
inverse proportionality to the fourth power of the sine of the 
scattering half-angle was fully confirmed. 
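
The reduction from the dimensionless parameters back to the classical expression can be checked numerically. The sketch below assumes that (16.7) has the form $n r_0^2 b^4 / (4\sin^4(\theta/2))$, which is what (16.5) and (16.6) give, and uses $r_0$, $\varepsilon$ and $b$ as defined by (14.8), (14.10) and (15.10). The function names and the numerical values are illustrative assumptions; units are arbitrary Gaussian-type units with `e2` standing for $e^2$.

```python
import math

def dimensionless_parameters(Z, e2, mu, p, hbar=1.0):
    """Return (r0, eps, b) built from (14.8), (14.10) and (15.10)."""
    E = p * p / (2.0 * mu)                 # energy at infinity
    r0 = hbar ** 2 / (2.0 * Z * e2 * mu)   # new unit of length
    eps = E * r0 / (2.0 * Z * e2)          # dimensionless energy
    b = 1.0 / math.sqrt(2.0 * eps)         # parameter b
    return r0, eps, b

def scattered_fraction(theta, Z, e2, mu, p, hbar=1.0):
    """Scattered particles per unit solid angle divided by the incident flux n."""
    r0, _, b = dimensionless_parameters(Z, e2, mu, p, hbar)
    return r0 ** 2 * b ** 4 / (4.0 * math.sin(theta / 2.0) ** 4)

def rutherford(theta, Z, e2, mu, p):
    """Classical form (Z e^2 mu / p^2)^2 / sin^4(theta/2) for a charge 2e projectile."""
    return (Z * e2 * mu / p ** 2) ** 2 / math.sin(theta / 2.0) ** 4

# Hypothetical sample values; Planck's constant drops out of the result.
for hbar in (0.5, 1.0, 2.0):
    print(scattered_fraction(1.0, 79, 1.0, 2.0, 3.0, hbar),
          rutherford(1.0, 79, 1.0, 2.0, 3.0))
```

The two expressions agree for any value of `hbar`, which illustrates why the quantum result coincides with the classical one.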


17. The virial theorem in classical 
and in quantum mechanics 


Classical mechanics contains a theorem that deals with the 
motion of a system of particles in a finite region of space. During 
this motion the particles’ coordinates and, naturally, their 
velocities remain finite. The virial theorem states that the time 
average value of kinetic energy T is related to the virial, which is 
a certain linear function of the derivatives, with respect to the 
rectangular coordinates, of potential energy U. The coefficients 
of the derivatives are proportional to the corresponding 
coordinates. For one mass point the virial is 

$$V = x\frac{\partial U}{\partial x} + y\frac{\partial U}{\partial y} + z\frac{\partial U}{\partial z} \qquad (17.1)$$

with U the potential energy. If we recall that the equations of 
motion are 

$$m\ddot{x} = -\frac{\partial U}{\partial x}, \quad m\ddot{y} = -\frac{\partial U}{\partial y}, \quad m\ddot{z} = -\frac{\partial U}{\partial z} \qquad (17.2)$$
we see that 

$$V = -m(x\ddot{x} + y\ddot{y} + z\ddot{z}) \qquad (17.3)$$

On the other hand, obviously, 

$$\frac{d}{dt}(x^2 + y^2 + z^2) = 2(x\dot{x} + y\dot{y} + z\dot{z}) \qquad (17.4)$$

$$\frac{d^2}{dt^2}(x^2 + y^2 + z^2) = 2(\dot{x}^2 + \dot{y}^2 + \dot{z}^2) + 2(x\ddot{x} + y\ddot{y} + z\ddot{z}) \qquad (17.5)$$

or 

$$\frac{d^2}{dt^2}(x^2 + y^2 + z^2) = \frac{4}{m}T - \frac{2}{m}V \qquad (17.6)$$

It is easy to see that the time average value of the left-hand side 
of (17.6) vanishes since it is equal to the difference of the values 
of (17.4) at two distant moments of time, divided by the time 
interval. 

Thus 

$$2\langle T\rangle = \langle V\rangle \qquad (17.7)$$

This is the essence of the virial theorem of classical mechanics 
for the case of one mass point. The theorem can be generalized 
for a system of mass points. 

If the potential energy is a homogeneous function of degree p 
in the (Cartesian) coordinates, the definition of the virial, (17.1), 
gives 

V = pU (17.8) 

and hence 

$$2\langle T\rangle = p\,\langle U\rangle \qquad (17.9)$$
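
As a quick numerical illustration of the classical theorem, the sketch below averages the kinetic energy, the potential energy and the virial over one period of a one-dimensional harmonic oscillator, for which U is homogeneous of degree p = 2. The oscillator parameters and the helper `time_average` are assumptions chosen only for this example.

```python
import math

def time_average(f, period, steps=20000):
    """Midpoint-rule average of f(t) over one period."""
    dt = period / steps
    return sum(f((k + 0.5) * dt) for k in range(steps)) * dt / period

m, omega, amp = 1.0, 2.0, 0.5            # illustrative values
period = 2.0 * math.pi / omega

def x(t):
    return amp * math.cos(omega * t)

def v(t):
    return -amp * omega * math.sin(omega * t)

def kinetic(t):
    return 0.5 * m * v(t) ** 2

def potential(t):
    return 0.5 * m * omega ** 2 * x(t) ** 2   # homogeneous of degree 2

def virial(t):
    return x(t) * (m * omega ** 2 * x(t))     # x * dU/dx

T_avg = time_average(kinetic, period)
U_avg = time_average(potential, period)
V_avg = time_average(virial, period)
print(2 * T_avg, V_avg, 2 * U_avg)
```

The three printed numbers coincide to rounding error, as (17.7) and (17.9) require for p = 2.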

Now we take up the case involving quantum mechanics. The 
time average of a classical quantity has corresponding to it in 
quantum mechanics the mathematical expectation of the quantum 
analogue of this quantity in a state with definite energy. Let us 
show that, if we relate classical and quantum quantities in such 
a way, there is a condition in quantum mechanics similar to the 
virial theorem of classical mechanics. 

The Schrödinger equation for the motion of a mass point (as 
well as for the motion of a system of mass points) can be obtained 
from the variational method. This method states that for actual 
motion 

$$\delta I = 0 \qquad (17.10)$$

where the action integral, $I$, is 

$$I = \int \bar{\psi}\,(T + U - E)\,\psi\,d\tau \qquad (17.11)$$

with T the kinetic-energy operator, U the potential energy 
operator, and E the energy parameter (total energy). 
We will keep to the case of one mass point. Then 

$$T = -\frac{\hbar^2}{2m}\nabla^2 \qquad (17.12)$$

where $\nabla^2$ is the Laplacian operator. 

If $\psi$ is normalized so that 

$$\int \bar{\psi}\psi\,d\tau = 1 \qquad (17.13)$$

then 

$$T_0 = \int \bar{\psi}\,T\,\psi\,d\tau \qquad (17.14)$$

is the mathematical expectation of kinetic energy and 

$$U_0 = \int \bar{\psi}\,U\,\psi\,d\tau \qquad (17.15)$$

is the mathematical expectation of potential energy. 

After these preliminary remarks we can formulate the virial 
theorem as follows. 

If the wave function, $\psi$, belongs to the discrete spectrum¹⁰ and 
the potential energy is a homogeneous function of degree p in 
the coordinates, that is, if 

$$U(\lambda\mathbf{r}) = \lambda^p U(\mathbf{r}) \qquad (17.16)$$

we come to the following relationship: 

$$2T_0 = pU_0 \qquad (17.17)$$

that is, the double mathematical expectation of kinetic energy is 
equal to p times the mathematical expectation of potential 
energy. 

To prove the theorem we first change $\mathbf{r}$ in $\psi(\mathbf{r})$ to a proportional 
quantity ($\mathbf{r} \to \lambda\mathbf{r}$) and consider the function 

$$\psi^*(\mathbf{r}, \lambda) = \lambda^{3/2}\,\psi(\lambda\mathbf{r}) \qquad (17.18)$$

which according to (17.13) is normalized to unity. Next we 
substitute $\psi^*$ into action integral (17.11) and denote by $T_0^*$ and $U_0^*$ 
the mathematical expectations of the kinetic and potential 
energies in the state described by $\psi^*$. When x, y, z change to 
$\lambda x, \lambda y, \lambda z$ (that is, with the change of scale), T changes to 
$\lambda^2 T$ (normalization for $\psi^*$ coincides with normalization for $\psi$). 
This means that 

$$T_0^* = \int \bar{\psi}^*\,T\,\psi^*\,d\tau = \lambda^2 T_0 \qquad (17.19)$$


10 This corresponds to a motion in a finite region of space and with a finite 
velocity. 
and also that 

$$U_0^* = \int \bar{\psi}^*\,U\,\psi^*\,d\tau = \int \bar{\psi}(\mathbf{r})\,U\!\left(\frac{\mathbf{r}}{\lambda}\right)\psi(\mathbf{r})\,d\tau \qquad (17.20)$$

which owing to the homogeneity of U yields 

$$U_0^* = \lambda^{-p} U_0 \qquad (17.21)$$

The action integral is then 

$$I = \lambda^2 T_0 + \lambda^{-p} U_0 - E \qquad (17.22)$$

Finally, nullifying its variation in parameter $\lambda$, we get 

$$\delta I = (2\lambda T_0 - p\lambda^{-p-1} U_0)\,\delta\lambda = 0 \qquad (17.23)$$

But the solution of the variational problem is found at $\lambda = 1$. 
Hence 

$$2T_0 = pU_0 \qquad (17.24)$$

which is what we set out to prove. 

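In the general case of an arbitrary potential energy we have 

$$2T_0 = -\left(\frac{\partial U_0^*}{\partial\lambda}\right)_{\lambda=1} = \mathrm{M.E.}\left(x\frac{\partial U}{\partial x} + y\frac{\partial U}{\partial y} + z\frac{\partial U}{\partial z}\right) = \mathrm{M.E.}\,V \qquad (17.25)$$

where M.E. stands for the mathematical expectation. 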


We can apply the same reasoning to the case of a system of 
particles if operators T and U are homogeneous functions defined 
by (17.19) and (17.21). 

The relationship expressing the virial theorem is satisfied not 
only by the exact solution but by approximate solutions as well, 
provided that these are obtained by using the variational method, 
which allows for variation of scale. Variation of scale means that 
the wave function can be transformed according to (17.18), with 
an appropriate generalization for many particles. After this 
transformation $\lambda$ can be found by the variational method. For instance, 
the method of self-consistent field (see Part IV), obtained by the 
variational method, allows for variation of scale. 
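
The role of the variation of scale can be seen in a small numerical sketch. It takes a trial function that does not satisfy the virial relation, finds the scale factor from (17.23), and checks that the rescaled expectation values do satisfy $2T = pU$. The example assumes the Coulomb potential $U = -1/r$ in atomic units (so p = −1) and a trial function $e^{-\alpha r}$, for which the standard values $T_0 = \alpha^2/2$ and $U_0 = -\alpha$ are quoted without derivation; the function names are hypothetical.

```python
def optimal_scale(T0, U0, p):
    """Scale factor that makes I = lam**2*T0 + lam**(-p)*U0 - E stationary.

    From (17.23): 2*lam*T0 - p*lam**(-p - 1)*U0 = 0.
    Assumes p != -2 and p*U0/(2*T0) > 0.
    """
    return (p * U0 / (2.0 * T0)) ** (1.0 / (p + 2))

def scaled_energies(T0, U0, p, lam):
    """Expectation values after the change of scale, per (17.19) and (17.21)."""
    return lam ** 2 * T0, lam ** (-p) * U0

# Trial function exp(-alpha*r) with a deliberately poor exponent.
alpha = 2.0
T0, U0, p = alpha ** 2 / 2.0, -alpha, -1

lam = optimal_scale(T0, U0, p)
T, U = scaled_energies(T0, U0, p, lam)
print("before scaling:", 2 * T0 - p * U0)
print("after scaling: ", 2 * T - p * U)
```

In this example the optimal scale turns the trial exponent into the exact ground-state exponent, so the virial relation is restored exactly.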


18. Some remarks concerning the superposition 

principle and the probabilistic interpretation of the wave function 

In Sections 14-16 we considered the problem of a particle 
colliding with the heavy nucleus of an atom. This problem is of special 
interest because it provides a graphic demonstration of the 
physical meaning of the concept of a wave function. 

We know that the wave function $\psi(x, y, z, t)$ serves to describe 
the state of one particle. This state can be such that a given 
physical quantity does not have a definite value. For instance, let 
the wave function of a free particle be 

$$\psi = \left(c_1 e^{ixp/\hbar} + c_2 e^{iyp/\hbar}\right)\exp\left(-i\frac{p^2}{2m\hbar}t\right) \qquad (18.1)$$

In this state the particle’s energy is 

$$E = \frac{p^2}{2m} \qquad (18.2)$$

But we cannot specify the direction of its motion. If we were to 
run a large number of tests that made it possible to determine the 
direction of motion (say, with the help of a specially oriented 
aperture), we would find that there is a probability of the particle 
moving along the x axis with a velocity v = p/m. There would 
also be a probability of the particle moving along the y axis with 
the same velocity. In other words, in the state described by wave 
function (18.1) the particle has a potential possibility of being 
observed moving in one or the other direction. Obviously, this is 
a state different from the one in which the particle moves along 
the bisector of the angle between the x and y axes. In the latter 
case a particle having a velocity corresponding to the value of 
energy (18.2) would be in the state 

$$\psi = c_3\, e^{-iEt/\hbar}\exp\left(\frac{i}{\hbar}\,\frac{x+y}{\sqrt{2}}\,p\right) \qquad (18.3)$$

which is completely different from (18.1). The possibility of the 
existence of states in which a given quantity does not have a 
definite value and which are linear combinations of states with 
definite values of this quantity (the superposition principle) is 
most characteristic of quantum mechanics, and herein lies the 
fundamental difference between quantum and classical mechanics. 
The language of classical mechanics cannot describe such a 
“mixed” state of one particle. The superposition principle is 
necessary, however, if we want to derive, starting from a general 
principle, the dual nature of light and matter, which manifest 
themselves as waves and as particles. 

In the early history of wave mechanics the wave function was 
interpreted as a certain wave in physical space, a wave that was 
connected with a collection of particles capable of diffraction (de 
Broglie’s wave). The superposition principle rested on this 
assumption. For instance, in (18.1) the term 

$$c_1 e^{ixp/\hbar}\exp\left(-i\frac{p^2}{2m\hbar}t\right)$$

corresponded to a wave representing a flux of particles moving 
along the x axis with velocity $v = p/m$, whereas the second term 
represented a flux of particles moving along the y axis. The whole 
wave function was, so to say, the superposition of these two fluxes, 
and their ratio was equal to the ratio of the amplitudes’ square 
moduli, $|c_1|^2/|c_2|^2$. 

Such an interpretation of the wave function, which we owe to de 
Broglie, was pictorial. Strictly speaking, however, it was not a 
precise interpretation because the two fluxes can interfere with 
each other and superposition can be understood as a linear 
combination of the wave functions representing the fluxes. More than 
that, we know that a wave function describes the state of only 
one particle (and not a flux) and must be interpreted on the basis 
of the potential possibility of certain results being obtained in 
experiments (measurements) with the particle. We have mentioned 
this fact in Section 6, Chapter IV, Part I. There we also noted 
that a quantum object cannot be an element of a statistical 
ensemble. For this reason there is no purpose in introducing 
“ensembles” of such objects, as is done in statistical physics. If 
we want to assume that the concept of probability requires the 
existence of an ensemble, this can only be an ensemble of the 
results of a definite set of experiments. 

Let us return to the wave function (18.1). If we consider it from 
the angle of potential possibility, we can build formulas for the 
probabilities of finding a particle moving along the x and y axes, 
provided that initially the particle was in state (18.1). The 
probabilities will then be proportional to the squares of the moduli 
of the amplitudes, and their ratio will be equal to $|c_1|^2/|c_2|^2$. This result 
coincides with what we get when we consider de Broglie’s waves 
as referring to fluxes of particles. But we should not understand 
the word “fluxes” too literally. We must allow for the possibility 
of mutual destruction of the waves that represent the fluxes as 
the result of interference of the waves. 
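
The rule for the probabilities can be written as a short sketch. It assumes that motion along x and motion along y are the only two outcomes of the measurement, so that the squared moduli are normalized to sum to unity; the function name and the sample amplitudes are illustrative.

```python
def direction_probabilities(c1, c2):
    """Probabilities of finding motion along x and along y in state (18.1).

    Assumes the two outcomes exhaust the possibilities, so the squared
    moduli of the amplitudes are normalized to sum to one.
    """
    w1, w2 = abs(c1) ** 2, abs(c2) ** 2
    total = w1 + w2
    return w1 / total, w2 / total

# Hypothetical complex amplitudes.
px, py = direction_probabilities(1 + 1j, 1)
print(px, py, px / py)
```

Only the moduli of the amplitudes enter the result; their phases matter only where the two waves can interfere.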



Part III 


PAULI’S THEORY OF THE ELECTRON 


1. The electron angular momentum 

In Section 7, Chapter II, we considered the operators of angular 
momentum, which we built using the position and momentum 
operators: 

$$m_x = yp_z - zp_y, \quad m_y = zp_x - xp_z, \quad m_z = xp_y - yp_x \qquad (1.1)$$

These operators can represent the angular momentum of a mass 
point with three degrees of freedom, that is, moving in space. 
The behaviour of an electron in a magnetic field and the 
properties of systems of many electrons (the electron shell of an 
atom, for instance) show that the electron possesses an internal 
degree of freedom connected with its intrinsic angular momentum, 
which does not depend on the motion in space. This internal 
degree of freedom (or the corresponding intrinsic angular 
momentum) of the electron is called the electron spin. 

The properties of the electron’s intrinsic angular momentum 
(spin) can be studied by using the commutation relations 
between the components of the conventional (orbital) angular 
momentum 

$$m_y m_z - m_z m_y = i\hbar m_x$$
$$m_z m_x - m_x m_z = i\hbar m_y$$
$$m_x m_y - m_y m_x = i\hbar m_z \qquad (1.2)$$

We assume here that spin angular momentum satisfies the same 
relations. We also introduce the hypothesis that the operators for 
each of the spin components possess only two eigenvalues that 
differ only in their signs. 

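The operators of spin angular momentum components can be 
written as 

$$(m_x)_{\mathrm{spin}} = \frac{\hbar}{2}\sigma_x, \quad (m_y)_{\mathrm{spin}} = \frac{\hbar}{2}\sigma_y, \quad (m_z)_{\mathrm{spin}} = \frac{\hbar}{2}\sigma_z \qquad (1.3)$$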
where $\sigma_x, \sigma_y, \sigma_z$ are hermitian operators with two nondegenerate 
eigenvalues, +1 and −1. The factor 1/2 appears because 
operators (1.3) must satisfy (1.2). 

When calculating spin in units of $\hbar$ (and not in units of $\hbar/2$), 
it is convenient to use the operators 

$$s_x = \tfrac{1}{2}\sigma_x, \quad s_y = \tfrac{1}{2}\sigma_y, \quad s_z = \tfrac{1}{2}\sigma_z$$

instead of $\sigma_x, \sigma_y, \sigma_z$. We will use these new operators in Part IV 
to study the many-electron problem. 

The commutation relations for $\sigma_x, \sigma_y, \sigma_z$ are 

$$\sigma_y\sigma_z - \sigma_z\sigma_y = 2i\sigma_x$$
$$\sigma_z\sigma_x - \sigma_x\sigma_z = 2i\sigma_y$$
$$\sigma_x\sigma_y - \sigma_y\sigma_x = 2i\sigma_z \qquad (1.4)$$

At the same time there is the following identity: 

$$\sigma_x^2\sigma_z - \sigma_z\sigma_x^2 = \sigma_x(\sigma_x\sigma_z - \sigma_z\sigma_x) + (\sigma_x\sigma_z - \sigma_z\sigma_x)\sigma_x$$

which together with (1.4) yields 

$$\sigma_x^2\sigma_z - \sigma_z\sigma_x^2 = -2i(\sigma_x\sigma_y + \sigma_y\sigma_x)$$

According to our hypothesis, the eigenvalues of $\sigma_x$ are ±1, so 
that $\sigma_x^2$ has only one eigenvalue, $\sigma_x^2 = 1$. This means that $\sigma_x^2$ is 
not an operator but simply a number, which commutes with any 
operator, including $\sigma_z$. Hence the left-hand side of the last 
equality vanishes, and so does the right-hand side. Together with 
similar equalities for other components this yields 

$$\sigma_y\sigma_z + \sigma_z\sigma_y = 0$$
$$\sigma_z\sigma_x + \sigma_x\sigma_z = 0$$
$$\sigma_x\sigma_y + \sigma_y\sigma_x = 0 \qquad (1.5)$$

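which are called the anticommutation relations (the operators in 
this case are said to anticommute). If we compare (1.4) with (1.5), 
we can see that 

$$\sigma_y\sigma_z = -\sigma_z\sigma_y = i\sigma_x, \quad \sigma_z\sigma_x = -\sigma_x\sigma_z = i\sigma_y, \quad \sigma_x\sigma_y = -\sigma_y\sigma_x = i\sigma_z \qquad (1.6)$$

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = 1 \qquad (1.7)$$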


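We can consider $\sigma_x, \sigma_y, \sigma_z$ either as 2 × 2 matrices or as 
operators that act on functions of a certain new variable $\sigma$ (in addition 
to coordinates) that assumes only two values, $\sigma = \pm 1$, say. 
Denoting a function of this kind by $\psi(\mathbf{r}, \sigma)$, where $\mathbf{r}$ represents all 
three position coordinates, we can satisfy relations (1.6) and 
(1.7) if, for instance, we put 

$$\sigma_x\psi(\mathbf{r}, \sigma) = \psi(\mathbf{r}, -\sigma)$$
$$\sigma_y\psi(\mathbf{r}, \sigma) = -i\sigma\,\psi(\mathbf{r}, -\sigma)$$
$$\sigma_z\psi(\mathbf{r}, \sigma) = \sigma\,\psi(\mathbf{r}, \sigma) \qquad (1.8)$$

If we write $\psi$ as a two component column function 

$$\psi = \begin{pmatrix} \xi \\ \eta \end{pmatrix} \qquad (1.9)$$

where $\xi = \psi(\mathbf{r}, +1)$ and $\eta = \psi(\mathbf{r}, -1)$, then the previous 
formulas are written 

$$\sigma_x\begin{pmatrix} \xi \\ \eta \end{pmatrix} = \begin{pmatrix} \eta \\ \xi \end{pmatrix}, \quad \sigma_y\begin{pmatrix} \xi \\ \eta \end{pmatrix} = \begin{pmatrix} -i\eta \\ i\xi \end{pmatrix}, \quad \sigma_z\begin{pmatrix} \xi \\ \eta \end{pmatrix} = \begin{pmatrix} \xi \\ -\eta \end{pmatrix} \qquad (1.10)$$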

Hence the operators can be represented in matrix form as 

$$\sigma_x = \sigma_1, \quad \sigma_y = \sigma_2, \quad \sigma_z = \sigma_3 \qquad (1.11)$$

where 

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (1.12)$$

These are the Pauli matrices. The choice of matrices for 
operators $\sigma_x, \sigma_y, \sigma_z$ is determined to within a canonical 
transformation (which corresponds to a linear combination of $\xi$ and $\eta$). 
Whence, if we assume, as is customary in scientific literature, that 
$\sigma_1, \sigma_2, \sigma_3$ are matrices with numbers as elements, (1.12), and 
preserve the physical meaning of $\sigma_x, \sigma_y, \sigma_z$ that follows from (1.8), 
then equalities (1.11) are not the only possible ones and can be 
replaced by equivalent equalities. 

The operators $\sigma_x, \sigma_y, \sigma_z$, which satisfy (1.6) and (1.7), are of a 
vector nature in the sense that, if we introduce three linear forms 

$$\sigma'_x = l_1\sigma_x + m_1\sigma_y + n_1\sigma_z$$
$$\sigma'_y = l_2\sigma_x + m_2\sigma_y + n_2\sigma_z$$
$$\sigma'_z = l_3\sigma_x + m_3\sigma_y + n_3\sigma_z \qquad (1.13)$$

where $l_k, m_k, n_k$ ($k = 1, 2, 3$) are the cosines of the angles between 
two rectangular coordinate systems, the new operators, $\sigma'_x, \sigma'_y, \sigma'_z$, 
will have the same properties (1.6) and (1.7) as the old operators, 
$\sigma_x, \sigma_y, \sigma_z$. It then follows that if we consider the three quantities 
to be the components of a vector, the eigenvalues of the projection 
of this vector on any direction are ±1. Note that when we assume 
operators $\sigma_x, \sigma_y, \sigma_z$ to be matrices (1.12), as was done in (1.11), 
then (1.13) are the most general relationships for 2 × 2 matrices 
that satisfy (1.6) and (1.7). 

The three matrices (1.12) together with the unit matrix form a 
complete set in the sense that an arbitrary 2 × 2 matrix can be 
expressed in terms of a linear combination of these four matrices 
with coefficients that are complex numbers. If the matrix under 
consideration is hermitian, the coefficients are real. 
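
The relations (1.6) and (1.7) and the completeness statement can be checked with a few lines of code. The sketch below is pure Python (no external libraries) and assumes the standard Pauli matrices for (1.12); the helper names `matmul`, `scaled` and `pauli_coefficients` are introduced here only for illustration.

```python
def matmul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scaled(c, a):
    """Matrix a multiplied by the number c."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

IDENTITY = [[1, 0], [0, 1]]
SIGMA_1 = [[0, 1], [1, 0]]
SIGMA_2 = [[0, -1j], [1j, 0]]
SIGMA_3 = [[1, 0], [0, -1]]

def pauli_coefficients(matrix):
    """Return (c0, c1, c2, c3) with matrix = c0*1 + c1*s1 + c2*s2 + c3*s3."""
    (a, b), (c, d) = matrix
    return ((a + d) / 2, (b + c) / 2, 1j * (b - c) / 2, (a - d) / 2)

# Relations (1.6) and (1.7) for one cyclic pair.
print(matmul(SIGMA_2, SIGMA_3) == scaled(1j, SIGMA_1))
print(matmul(SIGMA_1, SIGMA_1) == IDENTITY)

# A hermitian example: all four coefficients have zero imaginary part.
print(pauli_coefficients([[2, 1 - 1j], [1 + 1j, 0]]))
```

The coefficient formulas follow from the fact that each Pauli matrix is traceless and squares to the unit matrix.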

Now we turn to the problem of total angular momentum 
operators. If we take the operators of spin angular momentum in the 
form (1.3), the operators of total angular momentum are obviously 

$$\mathcal{M}_x = m_x + \frac{\hbar}{2}\sigma_x, \quad \mathcal{M}_y = m_y + \frac{\hbar}{2}\sigma_y, \quad \mathcal{M}_z = m_z + \frac{\hbar}{2}\sigma_z \qquad (1.14)$$

Owing to the general property of angular momentum, operators 
(1.14) must satisfy the same commutation relations as the 
operators of orbital and spin angular momenta. This results in the 
following relations similar to (1.2): 

$$\mathcal{M}_y\mathcal{M}_z - \mathcal{M}_z\mathcal{M}_y = i\hbar\mathcal{M}_x$$
$$\mathcal{M}_z\mathcal{M}_x - \mathcal{M}_x\mathcal{M}_z = i\hbar\mathcal{M}_y$$
$$\mathcal{M}_x\mathcal{M}_y - \mathcal{M}_y\mathcal{M}_x = i\hbar\mathcal{M}_z \qquad (1.15)$$

These commutation relations can be verified directly by using 
relations (1.2) and (1.4) for $m_x, m_y, m_z$ and $\sigma_x, \sigma_y, \sigma_z$ respectively 
and remembering that orbital angular momentum commutes with 
spin angular momentum. 

We can use the components of the spin and orbital angular 
momenta to build a bilinear form that commutes with each 
component of the total angular momentum. Indeed, let us put 

$$\mathcal{M} = \sigma_x m_x + \sigma_y m_y + \sigma_z m_z + \hbar \qquad (1.16)$$

or, which is the same, 

$$\mathcal{M} = \sigma_x\mathcal{M}_x + \sigma_y\mathcal{M}_y + \sigma_z\mathcal{M}_z - \frac{\hbar}{2} \qquad (1.16^*)$$

According to the properties of $m_x, m_y, m_z$, 

$$\mathcal{M}m_x - m_x\mathcal{M} = -i\hbar(\sigma_y m_z - \sigma_z m_y)$$

whereas according to the properties of $\sigma_x, \sigma_y, \sigma_z$ 

$$\frac{\hbar}{2}(\mathcal{M}\sigma_x - \sigma_x\mathcal{M}) = i\hbar(\sigma_y m_z - \sigma_z m_y)$$

If we then add the two equalities, the right-hand side vanishes. 
The expression on the left together with the similar relations for 
the other two components yields 

$$\mathcal{M}\mathcal{M}_x - \mathcal{M}_x\mathcal{M} = 0$$
$$\mathcal{M}\mathcal{M}_y - \mathcal{M}_y\mathcal{M} = 0$$
$$\mathcal{M}\mathcal{M}_z - \mathcal{M}_z\mathcal{M} = 0 \qquad (1.17)$$

Let us see how operator $\mathcal{M}$ is connected with the operator of 
orbital angular momentum in Schrödinger’s theory. We find that 

$$\mathcal{M}^2 - \hbar\mathcal{M} = m_x^2 + m_y^2 + m_z^2 \qquad (1.18)$$

On the other hand, we can square the operators corresponding to the 
total angular momentum components, (1.14). We then see that 

$$\mathcal{M}_x^2 + \mathcal{M}_y^2 + \mathcal{M}_z^2 = \mathcal{M}^2 - \frac{\hbar^2}{4} \qquad (1.19)$$

Thus the square of vector $(\mathcal{M}_x, \mathcal{M}_y, \mathcal{M}_z)$ differs from the square 
of the scalar operator $\mathcal{M}$ only by the term $\hbar^2/4$. We will call $\mathcal{M}$ 
the operator of the spin-orbit scalar of angular momentum. 

The right-hand side of (1.18) is an operator in Schrödinger’s 
theory and does not contain the Pauli matrices. Its eigenvalues 
are $\hbar^2 l(l+1)$, where $l$ is a positive integer or zero. If we denote 
the eigenvalues of $\mathcal{M}$ by $\hbar k$, we have 

$$k(k-1) = l(l+1) \qquad (1.20)$$

which for a given value of $l$ yields 

$$k = -l \quad \text{or} \quad k = l+1 \qquad (1.21)$$

But k cannot be zero. Indeed, from formula (1.19) it follows 
that the mathematical expectation of $\mathcal{M}^2$ exceeds $\hbar^2/4$ in any state. 
Hence the eigenvalue equation 

$$\mathcal{M}\psi = k\hbar\psi \qquad (1.22)$$

cannot have a zero eigenvalue. This in turn means that at $l = 0$ 
the only possible value of k is $k = 1$, and at $l \geq 1$ there are two 
values of k, given by formula (1.21). 
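
The allowed values of k can be enumerated with a short sketch. The function `allowed_k` encodes (1.21) with the value k = 0 excluded. The function `total_j` is an added remark that is not in the text: if the eigenvalue of the left-hand side of (1.19) is written as $\hbar^2 j(j+1)$, then $j(j+1) = k^2 - 1/4$, which gives $j = |k| - 1/2$.

```python
from fractions import Fraction

def allowed_k(l):
    """Values of k permitted by (1.21), with k = 0 excluded."""
    return [1] if l == 0 else [-l, l + 1]

def total_j(k):
    """j such that j(j + 1) = k**2 - 1/4 (an added remark, see lead-in)."""
    return abs(k) - Fraction(1, 2)

for l in range(4):
    for k in allowed_k(l):
        # Relation (1.20) must hold for every allowed pair (l, k).
        assert k * (k - 1) == l * (l + 1)
        print(l, k, total_j(k))
```

For each $l \geq 1$ the two values of k correspond to $j = l - 1/2$ and $j = l + 1/2$.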

2. The operators of total angular momentum 
in spherical coordinates 

When we studied the problem of motion in a central field 
in Schrödinger’s theory (Chapter IV, Part II), we found the 
expressions for the operators of orbital angular momentum in 
spherical coordinates $r, \theta, \varphi$. These coordinates are connected with 
rectangular coordinates x, y, z via the following formulas: 

$$x = r\sin\theta\cos\varphi, \quad y = r\sin\theta\sin\varphi, \quad z = r\cos\theta \qquad (2.1)$$

If we put 

$$p_r = -i\hbar\frac{\partial}{\partial r}, \quad p_\theta = -i\hbar\frac{\partial}{\partial\theta}, \quad p_\varphi = -i\hbar\frac{\partial}{\partial\varphi} \qquad (2.2)$$

then according to formulas (3.2), Chapter IV, Part II, we have 

$$m_x = -\sin\varphi\,p_\theta - \cot\theta\cos\varphi\,p_\varphi$$
$$m_y = \cos\varphi\,p_\theta - \cot\theta\sin\varphi\,p_\varphi$$
$$m_z = p_\varphi \qquad (2.3)$$

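The simultaneous eigenfunctions of the square of angular 
momentum and the projection of angular momentum on the z axis 
must in Schrödinger’s theory satisfy the following equations: 

$$-\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2}\right] = \hbar^2 l(l+1)\psi \qquad (2.4)$$

$$m_z\psi = -i\hbar\frac{\partial\psi}{\partial\varphi} = m\hbar\psi \qquad (2.5)$$

and must be single-valued on the surface of a sphere. 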

Now we consider the total angular momentum, which includes 
spin. Let us reintroduce the operators $\mathcal{M}_z$ and $\mathcal{M}$ of the previous 
section [see (1.14) and (1.16)]. In terms of $p_\theta$ and $p_\varphi$ these 
operators are 

$$\mathcal{M}_z = p_\varphi + \frac{\hbar}{2}\sigma_z \qquad (2.6)$$

$$\mathcal{M} = (-\sigma_x\sin\varphi + \sigma_y\cos\varphi)\,p_\theta + (-\sigma_x\cot\theta\cos\varphi - \sigma_y\cot\theta\sin\varphi + \sigma_z)\,p_\varphi + \hbar \qquad (2.7)$$

We ask for the simultaneous eigenfunctions of $\mathcal{M}_z$ and $\mathcal{M}$. 
These functions must satisfy the equations 

$$\mathcal{M}_z\psi = \hbar\left(m + \tfrac{1}{2}\right)\psi \qquad (2.8)$$

$$\mathcal{M}\psi = \hbar k\psi \qquad (2.9)$$

and must be single-valued on the surface of a sphere. 

As to the eigenvalues of the two operators, we have already 
stated that for a given nonzero value of $l$ the quantum number k 
can assume the values $k = -l$ and $k = l + 1$, whereas at $l = 0$ 
the only possible value of k is unity. The quantum number m is 
the same as in Schrödinger’s theory. It can assume all integral 
values from $m = -l$ to $m = +l$. 

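To simplify (2.6) and (2.7) let us use the canonical 
transformation 

$$L' = SLS^+, \quad \psi' = S\psi \qquad (2.10)$$

where 

$$S = \cos\frac{\varphi}{2} + i\sigma_z\sin\frac{\varphi}{2}, \quad S^+ = \cos\frac{\varphi}{2} - i\sigma_z\sin\frac{\varphi}{2} \qquad (2.11)$$

If prior to transformation 

$$\sigma_x = \sigma_1, \quad \sigma_y = \sigma_2, \quad \sigma_z = \sigma_3 \qquad (2.12)$$

$$\mathcal{M}_z = -i\hbar\frac{\partial}{\partial\varphi} + \frac{\hbar}{2}\sigma_3 \qquad (2.13)$$

then after transformation 

$$\sigma'_x = \sigma_1\cos\varphi - \sigma_2\sin\varphi, \quad \sigma'_y = \sigma_1\sin\varphi + \sigma_2\cos\varphi, \quad \sigma'_z = \sigma_3 \qquad (2.14)$$

$$\mathcal{M}'_z = -i\hbar\frac{\partial}{\partial\varphi} \qquad (2.15)$$

Changing the operators $\sigma_x, \sigma_y, \sigma_z$ in (2.7) according to (2.14) 
to $\sigma'_x, \sigma'_y, \sigma'_z$ and $p_\varphi$ to 

$$p'_\varphi = Sp_\varphi S^+ = p_\varphi - \frac{\hbar}{2}\sigma_3 \qquad (2.16)$$

we find the transformed operator $\mathcal{M}$ in the form 

$$\mathcal{M}' = -\sigma_1\cot\theta\,p_\varphi + \sigma_2\left(p_\theta - \frac{i\hbar}{2}\cot\theta\right) + \sigma_3 p_\varphi + \frac{\hbar}{2} \qquad (2.17)$$

where $p_\varphi$ and $p_\theta$ are understood to be the operators (2.2). 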

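To further simplify $\mathcal{M}'$ we apply to it two transformations one 
after another. First, we put 

$$\mathcal{M}'' = T\mathcal{M}'T^+, \quad \psi'' = T\psi' \qquad (2.18)$$

where 

$$T = \cos\frac{\theta}{2} + i\sigma_2\sin\frac{\theta}{2}, \quad T^+ = \cos\frac{\theta}{2} - i\sigma_2\sin\frac{\theta}{2} \qquad (2.19)$$

We get 

$$\mathcal{M}'' = -\frac{\sigma_1}{\sin\theta}\,p_\varphi + \sigma_2\left(p_\theta - \frac{i\hbar}{2}\cot\theta\right) \qquad (2.20)$$

Second, we put 

$$\mathcal{M}^* = (\sin\theta)^{1/2}\,\mathcal{M}''\,(\sin\theta)^{-1/2}, \quad \psi^* = (\sin\theta)^{1/2}\,\psi'' \qquad (2.21)$$

The transformed operator $\mathcal{M}^*$ now assumes the form 

$$\mathcal{M}^* = -\frac{\sigma_1}{\sin\theta}\,p_\varphi + \sigma_2 p_\theta \qquad (2.22)$$

which is much simpler than the initial form, (2.7). Operator $\mathcal{M}'_z$, 
on the other hand, has not changed after transformations (2.18) 
and (2.21), so that we have 

$$\mathcal{M}^*_z = p_\varphi \qquad (2.23)$$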





If we consider $\psi$ to be a two component wave function on the 
surface of a sphere [see (1.9)], 

$$\psi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} \qquad (2.24)$$

and if we put 

$$|\psi|^2 = |\psi_1|^2 + |\psi_2|^2 \qquad (2.25)$$

we can take the normalization condition to be 

5t 2jl 

4 ^ 5 5 I M 5 i 2 sin 0 rfe = 1 ( 2 . 26 ) 

o o 

The same will hold for $\psi'$ and $\psi''$. But for $\psi^*$, which differs from 
$\psi''$ by the factor $(\sin\theta)^{1/2}$, the normalization condition is 

n 2n 

5 | op* P rf<p = 1 (2.27) 

o o 

Only the original wave function is required to be single-valued. 
As for the transformed wave functions, they change their sign 
when $\varphi$ increases by $2\pi$ since the operators of the transformations, 
(2.11), are linear in $\sin(\varphi/2)$ and $\cos(\varphi/2)$. Hence the 
transformed functions are two-valued functions of position coordinates. 

The eigenvalue equations (2.8) and (2.9) for Jl z and Jl then 
read 

P = h (m + -^-) i|)* (2.28) 

— -^70 Apf + = tlW (2.29) 


3. Spherical harmonics with spin 

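If we write the two component wave function as

$$ \psi^{*} = \begin{pmatrix} Y \\ Z \end{pmatrix} \tag{3.1} $$

we see that Eq. (2.28) splits into two similar equations

$$ \frac{\partial Y}{\partial\varphi} = i\left(m + \tfrac12\right)Y, \qquad \frac{\partial Z}{\partial\varphi} = i\left(m + \tfrac12\right)Z \tag{3.2} $$

and Eq. (2.29) reduces to the system

$$ \frac{i}{\sin\theta}\frac{\partial Z}{\partial\varphi} - \frac{\partial Z}{\partial\theta} = kY, \qquad \frac{i}{\sin\theta}\frac{\partial Y}{\partial\varphi} + \frac{\partial Y}{\partial\theta} = kZ \tag{3.3} $$

If we want the solutions to satisfy (3.2), we put

$$ Y(\theta,\varphi) = \frac{1}{(4\pi)^{1/2}}\,e^{i(m+1/2)\varphi}A(\theta), \qquad Z(\theta,\varphi) = \frac{1}{(4\pi)^{1/2}}\,e^{i(m+1/2)\varphi}B(\theta) \tag{3.4} $$

Then $A(\theta)$ and $B(\theta)$ will satisfy the following system of equations:

$$ \frac{dA}{d\theta} - \frac{m+\frac12}{\sin\theta}\,A = kB, \qquad \frac{dB}{d\theta} + \frac{m+\frac12}{\sin\theta}\,B = -kA \tag{3.5} $$

and the normalization condition will be

$$ \frac12\int_0^{\pi}(A^2 + B^2)\,d\theta = 1 \tag{3.6} $$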

These last functions can be expressed in terms of the associated
Legendre polynomials of the first kind, $P_l^m(\cos\theta)$, studied in
Schrödinger's theory (see Sections 4-6, Chapter IV, Part II). Let
us recall some of their properties. The function

$$ P_l^m(x) = P_l^m(\cos\theta) \tag{3.7} $$

satisfies the equation

$$ \frac{d}{dx}\left[(1-x^2)\frac{dP_l^m}{dx}\right] - \frac{m^2}{1-x^2}\,P_l^m + l(l+1)\,P_l^m = 0 \tag{3.8} $$

and is the solution that remains finite at $x = \pm 1$. At $m = 0$ the
function $P_l^m(x)$ reduces to the Legendre polynomial

$$ P_l(x) = \frac{1}{2^l\,l!}\frac{d^l}{dx^l}(x^2-1)^l \tag{3.9} $$

and at $m > 0$ it is

$$ P_l^m(x) = (1-x^2)^{m/2}\frac{d^m}{dx^m}P_l(x) \tag{3.10} $$

For negative values of m the $P_l^m(x)$ are expressed in terms of the
associated Legendre polynomials with positive values of m:

$$ P_l^{-m}(x) = (-1)^m\frac{(l-m)!}{(l+m)!}\,P_l^m(x) \tag{3.11} $$

We also know [see (5.18) and (5.19), Chapter IV, Part II] that
$P_l^m(\cos\theta)$ and $P_l^{m+1}(\cos\theta)$, considered as functions of $\theta$, satisfy
the following system of first-order equations:

$$ \begin{aligned}
\frac{d}{d\theta}P_l^m(\cos\theta) - m\cot\theta\,P_l^m(\cos\theta) &= -P_l^{m+1}(\cos\theta) \\
\frac{d}{d\theta}P_l^{m+1}(\cos\theta) + (m+1)\cot\theta\,P_l^{m+1}(\cos\theta) &= (l+m+1)(l-m)\,P_l^m(\cos\theta)
\end{aligned} \tag{3.12} $$

Equations (3.5) reduce to this system. We can show this by
performing a transformation

$$ A + iB = (\sin\theta)^{1/2}\,e^{-i\theta/2}\,(y_1 + iy_2) \tag{3.13} $$

analogous to the one from $\psi^{*}$ to $\psi'$ given by (2.18) and (2.21) [we
consider $A$, $B$, $y_1$, and $y_2$ in (3.13) to be real]. Then the equations
for $y_1$ and $y_2$ are

$$ \begin{aligned}
\frac{dy_1}{d\theta} - m\cot\theta\,y_1 &= (k+m)\,y_2 \\
\frac{dy_2}{d\theta} + (m+1)\cot\theta\,y_2 &= (-k+m+1)\,y_1
\end{aligned} \tag{3.14} $$

These coincide with the equations for ordinary spherical harmonics,
(3.12), if we put

$$ y_1 = -c\,(k+m)\,P_l^m, \qquad y_2 = c\,P_l^{m+1} \tag{3.15} $$

or

$$ y_1 = -c'\,P_l^{-m}, \qquad y_2 = c'\,(-k+m+1)\,P_l^{-m-1} \tag{3.16} $$

and if we assume, following (1.20), that

$$ k(k-1) = l(l+1) \tag{3.17} $$

We can take l to be whichever of the two numbers $-k$ and $k-1$
is nonnegative. This condition can be written

$$ l + \tfrac12 = \left|k - \tfrac12\right| \tag{3.18} $$

Equating (3.15) with (3.16) and using formula (3.11), we get the
ratio of the constants c' and c:

$$ \frac{c'}{c} = (-1)^m\,(k+m)\,\frac{(l+m)!}{(l-m)!} \tag{3.19} $$

The value of each constant can be found from the normalization
condition (3.6), which can be rewritten thus:

$$ \frac12\int_0^{\pi}(y_1^2 + y_2^2)\sin\theta\,d\theta = 1 \tag{3.20} $$

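We get

$$ c = \frac{1}{|k+m|^{1/2}}\left[\frac{(l-m)!}{(l+m)!}\right]^{1/2}, \qquad c' = (-1)^m\,\frac{k+m}{|k+m|^{1/2}}\left[\frac{(l+m)!}{(l-m)!}\right]^{1/2} \tag{3.21} $$

If we introduce the normalized spherical harmonics by (6.7),
Chapter IV, Part II,

$$ P_l^{*\,m}(x) = (2l+1)^{1/2}\left[\frac{(l-m)!}{(l+m)!}\right]^{1/2}P_l^m(x), \qquad x = \cos\theta \tag{3.22} $$

where

$$ \frac12\int_{-1}^{+1}\left[P_l^{*\,m}(x)\right]^2dx = \frac12\int_0^{\pi}\left[P_l^{*\,m}(\cos\theta)\right]^2\sin\theta\,d\theta = 1 \tag{3.23} $$

then

$$ y_1 = -\frac{k+m}{[(k+m)(2k-1)]^{1/2}}\,P_l^{*\,m}, \qquad y_2 = \left(\frac{k-m-1}{2k-1}\right)^{1/2}P_l^{*\,m+1} \tag{3.24} $$

where the square roots must be taken to be positive. Bearing in
mind the relationship between k and l, (3.18), we write these
formulas as

$$ y_1 = -\left(\frac{k+m}{2k-1}\right)^{1/2}P_{k-1}^{*\,m}, \qquad y_2 = \left(\frac{k-m-1}{2k-1}\right)^{1/2}P_{k-1}^{*\,m+1}, \qquad k > 0 \tag{3.25} $$

$$ y_1 = \left(\frac{k+m}{2k-1}\right)^{1/2}P_{-k}^{*\,m}, \qquad y_2 = \left(\frac{k-m-1}{2k-1}\right)^{1/2}P_{-k}^{*\,m+1}, \qquad k < 0 \tag{3.25*} $$

These formulas demonstrate that the functions y with negative k
can be expressed in terms of functions with positive k:

$$ \begin{aligned}
y_1(-k, m, \theta) &= -\left(\frac{k-m}{k+m+1}\right)^{1/2}y_1(k+1, m, \theta) \\
y_2(-k, m, \theta) &= \left(\frac{k+m+1}{k-m}\right)^{1/2}y_2(k+1, m, \theta)
\end{aligned} \tag{3.26} $$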

We list some spherical harmonics with spin for various values
of k:

k = +1 (l = 0)
  m = -1:  $y_1 = 0$,   $y_2 = 1$
  m = 0:   $y_1 = -1$,  $y_2 = 0$

k = -1 (l = 1)
  m = -1:  $y_1 = -\sin\theta$,  $y_2 = \cos\theta$
  m = 0:   $y_1 = \cos\theta$,   $y_2 = \sin\theta$

k = +2 (l = 1)
  m = -2:  $y_1 = 0$,  $y_2 = -\sqrt{3/2}\,\sin\theta$
  m = -1:  $y_1 = \frac{1}{\sqrt2}\sin\theta$,  $y_2 = \sqrt2\cos\theta$
  m = 0:   $y_1 = -\sqrt2\cos\theta$,  $y_2 = \frac{1}{\sqrt2}\sin\theta$
  m = 1:   $y_1 = -\sqrt{3/2}\,\sin\theta$,  $y_2 = 0$
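The tabulated functions can be tested against the differential equations and the normalization they are supposed to satisfy. The sketch below is an illustration, not the author's procedure. It assumes that (3.14) reads dy_1/dθ − m cot θ y_1 = (k + m) y_2 and dy_2/dθ + (m + 1) cot θ y_2 = (−k + m + 1) y_1, and that (3.20) is (1/2)∫(y_1² + y_2²) sin θ dθ = 1. The entries of `TABLE` are transcribed from the list above; derivatives are taken by central differences and the integral by the midpoint rule.

```python
import math

SQ2 = math.sqrt(2.0)
SQ32 = math.sqrt(1.5)

# (k, m) -> (y1, y2) as functions of theta, transcribed from the table.
TABLE = {
    (1, -1): (lambda t: 0.0, lambda t: 1.0),
    (1, 0): (lambda t: -1.0, lambda t: 0.0),
    (-1, -1): (lambda t: -math.sin(t), lambda t: math.cos(t)),
    (-1, 0): (lambda t: math.cos(t), lambda t: math.sin(t)),
    (2, -2): (lambda t: 0.0, lambda t: -SQ32 * math.sin(t)),
    (2, -1): (lambda t: math.sin(t) / SQ2, lambda t: SQ2 * math.cos(t)),
    (2, 0): (lambda t: -SQ2 * math.cos(t), lambda t: math.sin(t) / SQ2),
    (2, 1): (lambda t: -SQ32 * math.sin(t), lambda t: 0.0),
}

def deriv(f, t, h=1e-6):
    """Central-difference derivative of f at t."""
    return (f(t + h) - f(t - h)) / (2 * h)

def residuals(k, m, t):
    """Left side minus right side of both equations (3.14) at angle t."""
    y1, y2 = TABLE[(k, m)]
    cot = math.cos(t) / math.sin(t)
    r1 = deriv(y1, t) - m * cot * y1(t) - (k + m) * y2(t)
    r2 = deriv(y2, t) + (m + 1) * cot * y2(t) - (-k + m + 1) * y1(t)
    return r1, r2

def norm(k, m, n=2000):
    """Midpoint rule for (1/2) * integral of (y1^2 + y2^2) sin(theta)."""
    y1, y2 = TABLE[(k, m)]
    h = math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += (y1(t) ** 2 + y2(t) ** 2) * math.sin(t)
    return 0.5 * total * h

worst_residual = max(abs(r) for key in TABLE for r in residuals(*key, 1.0))
worst_norm = max(abs(norm(*key) - 1.0) for key in TABLE)
print(worst_residual, worst_norm)
```

Both printed numbers should be small, limited only by the finite-difference and quadrature errors.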


4. Some properties of spherical harmonics with spin 

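If we write Eqs. (3.5) as

$$ -\frac{dB}{d\theta} - \frac{m+\frac12}{\sin\theta}\,B = kA, \qquad \frac{dA}{d\theta} - \frac{m+\frac12}{\sin\theta}\,A = kB \tag{4.1} $$

then these can be interpreted as the eigenvalue equations for the
hermitian operator

$$ \mathcal{L} = -i\sigma_2\frac{d}{d\theta} - \frac{m+\frac12}{\sin\theta}\,\sigma_1 \tag{4.2} $$

with the eigenvalue being k. It then follows that functions A and B
are orthogonal, that is

$$ \int_0^{\pi}\left[A(k,\theta)\,A(k',\theta) + B(k,\theta)\,B(k',\theta)\right]d\theta = 0, \qquad k \neq k' \tag{4.3} $$

and constitute a complete set.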

Now let us shift from A and B to $y_1$ and $y_2$ via (3.13). The
latter then also constitute a complete orthogonal set. Recalling
their normalization condition, (3.20), we can write

$$ \frac12\int_0^{\pi}\left[y_1(k,m,\theta)\,y_1(k',m,\theta) + y_2(k,m,\theta)\,y_2(k',m,\theta)\right]\sin\theta\,d\theta = \delta_{kk'} \tag{4.4} $$

or simply

$$ \frac12\int_0^{\pi}y(k,m,\theta)\,y(k',m,\theta)\sin\theta\,d\theta = \delta_{kk'} \tag{4.5} $$

where y denotes the set of two functions, $y_1$ and $y_2$.

An arbitrary pair of functions $u_1(\theta)$ and $u_2(\theta)$, which we denote
by one symbol $u(\theta)$, can be expanded (after certain conditions
are met) in a $y(k,m,\theta)$-series:

$$ u(\theta) = \sum_k c(k)\,y(k,m,\theta) \tag{4.6} $$

Actually, (4.6) is two expansions:

$$ u_p(\theta) = \sum_k c(k)\,y_p(k,m,\theta), \qquad p = 1, 2 \tag{4.6*} $$

where the expansion coefficients, c(k), are the same in both cases.
These coefficients can be found via the integral

$$ c(k) = \frac12\int_0^{\pi}y(k,m,\theta)\,u(\theta)\sin\theta\,d\theta \tag{4.7} $$

which in explicit form is

$$ c(k) = \frac12\int_0^{\pi}\left[y_1(k,m,\theta)\,u_1(\theta) + y_2(k,m,\theta)\,u_2(\theta)\right]\sin\theta\,d\theta \tag{4.7*} $$

With further applications in mind we put

$$ u(\theta) = y(k_0, m, \theta)\cos\theta \tag{4.8} $$


To evaluate integrals of type (4.7*) we express $y(k,m,\theta)$ in
terms of ordinary spherical harmonics via (3.25) and (3.25*) and
then use the recursion relation (6.11), Chapter IV, Part II. In
fact, only three expansion coefficients c(k) are nonzero, which
means that (4.6) contains no more than three terms. If we write k
instead of $k_0$, this expansion will be

$$ y(k,m,\theta)\cos\theta = -\frac{2m+1}{4k^2-1}\,y(-k,m,\theta) + \frac{[(k+m)(k-m-1)]^{1/2}}{|2k-1|}\,y(k-1,m,\theta) + \frac{[(k-m)(k+m+1)]^{1/2}}{|2k+1|}\,y(k+1,m,\theta) \tag{4.9} $$


[We can verify this by expressing $y(k,m,\theta)$ in terms of ordinary
spherical harmonics.]

Now suppose that

$$ u(\theta) = y(k_0, m-1, \theta)\sin\theta \tag{4.8*} $$

in contrast to (4.8). After a similar procedure we find the expansion
for (4.8*):

$$ y(k,m-1,\theta)\sin\theta = \frac{2\,[(k+m)(k-m)]^{1/2}}{4k^2-1}\,y(-k,m,\theta) - \frac{[(k-m)(k-m-1)]^{1/2}}{2k-1}\,y(k-1,m,\theta) + \frac{[(k+m)(k+m+1)]^{1/2}}{2k+1}\,y(k+1,m,\theta) \tag{4.10} $$

This can be verified by using the recursion relation (6.12), Chapter IV,
Part II, of which (4.10) is a generalization.

We note that (4.9) and (4.10) hold not only for $y_1$ and $y_2$ but
for functions A and B as well, since they are interconnected
through a linear transformation. The coefficients of this transformation
do not depend on either k or m.

The method by which we derived the recursion relations, using
the completeness condition, is sufficiently general. It can be
applied to ordinary spherical harmonics and to the generalized
Laguerre polynomials considered in Part II.
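The projection method can be tried on the simplest case, k_0 = 1, m = 0. The sketch below is illustrative and rests on assumptions: the three functions in `Y` are the m = 0 entries of the table in Section 3, the coefficients are computed from (4.7*) by a midpoint rule, and the function names are invented for this example. It then rebuilds u(θ) = y(1, 0, θ) cos θ from the computed coefficients.

```python
import math

SQ2 = math.sqrt(2.0)

# y(k, m=0, theta) = (y1, y2) for k = 1, -1, 2, taken from the table of Section 3.
Y = {
    1: lambda t: (-1.0, 0.0),
    -1: lambda t: (math.cos(t), math.sin(t)),
    2: lambda t: (-SQ2 * math.cos(t), math.sin(t) / SQ2),
}

def project(k, u, n=4000):
    """c(k) = (1/2) * integral of [y1*u1 + y2*u2] sin(theta), midpoint rule."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        y1, y2 = Y[k](t)
        u1, u2 = u(t)
        total += (y1 * u1 + y2 * u2) * math.sin(t)
    return 0.5 * total * h

def u(t):
    """The pair u = y(1, 0, theta) * cos(theta), i.e. (4.8) with k0 = 1, m = 0."""
    y1, y2 = Y[1](t)
    return (y1 * math.cos(t), y2 * math.cos(t))

coeffs = {k: project(k, u) for k in Y}

def rebuilt(t):
    """Sum of c(k) * y(k, 0, theta) over the k values kept in Y."""
    first = sum(coeffs[k] * Y[k](t)[0] for k in Y)
    second = sum(coeffs[k] * Y[k](t)[1] for k in Y)
    return (first, second)

print(coeffs)
print(u(0.9), rebuilt(0.9))
```

For this case the coefficient of y(2, 0, θ) should agree with the closed form [(k − m)(k + m + 1)]^{1/2}/|2k + 1| of (4.9), and the coefficient of y(1, 0, θ) itself should vanish.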


5. The Pauli wave equation 

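If a particle of mass m and electric charge $-e$ is in an electromagnetic
field with vector potential $A_x$, $A_y$, $A_z$ and scalar potential
$\Phi$, the Lagrange function (Lagrangian) of nonrelativistic classical
mechanics is

$$ \mathcal{L} = \tfrac12 m(\dot x^2 + \dot y^2 + \dot z^2) - \frac{e}{c}(\dot x A_x + \dot y A_y + \dot z A_z) + e\Phi \tag{5.1} $$

The generalized “momenta” conjugate to coordinates x, y, z are

$$ p_x = \frac{\partial\mathcal{L}}{\partial\dot x}, \qquad p_y = \frac{\partial\mathcal{L}}{\partial\dot y}, \qquad p_z = \frac{\partial\mathcal{L}}{\partial\dot z} \tag{5.2} $$

These do not coincide with the components of momentum

$$ P_x = m\dot x, \qquad P_y = m\dot y, \qquad P_z = m\dot z \tag{5.3} $$

but are related to them via the formulas

$$ p_x = P_x - \frac{e}{c}A_x, \qquad p_y = P_y - \frac{e}{c}A_y, \qquad p_z = P_z - \frac{e}{c}A_z \tag{5.4} $$

The particle's energy is

$$ E = \dot x p_x + \dot y p_y + \dot z p_z - \mathcal{L} = \tfrac12 m(\dot x^2 + \dot y^2 + \dot z^2) - e\Phi \tag{5.5} $$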






Expressing (5.5) in terms of generalized momenta, we get the
Hamiltonian function of classical mechanics

$$ H = \frac{1}{2m}\left[\left(p_x + \frac{e}{c}A_x\right)^2 + \left(p_y + \frac{e}{c}A_y\right)^2 + \left(p_z + \frac{e}{c}A_z\right)^2\right] - e\Phi \tag{5.6} $$

In the absence of a magnetic field the vector potential can be
taken to be zero, which transforms (5.6) into

$$ H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) - e\Phi \tag{5.7} $$

or

$$ H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + U(x, y, z) \tag{5.8} $$

where

$$ U = -e\Phi \tag{5.9} $$

is the particle's potential energy.

We have already seen that in Schrödinger's theory the Hamiltonian
can be obtained from the classical Hamiltonian function
by substituting for the generalized momenta $p_x$, $p_y$, $p_z$ the operators

$$ p_x = -i\hbar\frac{\partial}{\partial x}, \qquad p_y = -i\hbar\frac{\partial}{\partial y}, \qquad p_z = -i\hbar\frac{\partial}{\partial z} \tag{5.10} $$

What happens when we deal with spin? We can build the operator

$$ P = \sigma_x p_x + \sigma_y p_y + \sigma_z p_z \tag{5.11} $$

and use it in building the Hamiltonian.

Next we find the anticommutator of the operator

$$ \mathcal{M} = \sigma_x m_x + \sigma_y m_y + \sigma_z m_z + \hbar \tag{5.12} $$

[see (1.15)] and P. The two operators anticommute, that is, the
anticommutator equals zero:

$$ \mathcal{M}P + P\mathcal{M} = 0 \tag{5.13} $$

To prove this we use properties (1.6) of $\sigma_x$, $\sigma_y$, $\sigma_z$ and the
obvious commutation relations

$$ \begin{aligned}
m_y p_z - p_z m_y &= p_y m_z - m_z p_y = i\hbar\,p_x \\
m_z p_x - p_x m_z &= p_z m_x - m_x p_z = i\hbar\,p_y \\
m_x p_y - p_y m_x &= p_x m_y - m_y p_x = i\hbar\,p_z
\end{aligned} \tag{5.14} $$

Here we will not proceed with the calculations but will note that
they simplify considerably if we use spherical coordinates rather
than rectangular (Cartesian) coordinates. This will be done in
Section 6.

If we assume, as is done in Schrödinger's theory, that an electron
has only those degrees of freedom that are associated with
the motion of a mass point in ordinary three-dimensional space,
then substituting the operators into (5.8) unequivocally brings us
to the already studied Schrödinger expression for the Hamiltonian.
But if for an electron we introduce new degrees of freedom, namely,
the ones connected with spin, then there are new possibilities
of transforming from quantities of classical mechanics to
quantum mechanical operators.

Using the properties of the Pauli matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ and the
commutativity of $p_x$, $p_y$, $p_z$, we come to the following expression
for the Hamiltonian (5.8):

$$ H = \frac{1}{2m}(\sigma_x p_x + \sigma_y p_y + \sigma_z p_z)^2 + U(x, y, z) \tag{5.15} $$

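Hence introducing operator P, (5.11), gives nothing new. It is
different when there is a magnetic field. Then the classical Hamiltonian
function has the form (5.6) and the generalized momenta
do not coincide with the components of momentum but are
connected via (5.4). We write these as

$$ P_x = p_x + \frac{e}{c}A_x, \qquad P_y = p_y + \frac{e}{c}A_y, \qquad P_z = p_z + \frac{e}{c}A_z \tag{5.16} $$

If we consider these quantities as operators, they are not commutative.
The commutation relations for them are

$$ \begin{aligned}
\frac{i}{\hbar}(P_yP_z - P_zP_y) &= \frac{e}{c}\left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right) = \frac{e}{c}\,\mathcal{H}_x \\
\frac{i}{\hbar}(P_zP_x - P_xP_z) &= \frac{e}{c}\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) = \frac{e}{c}\,\mathcal{H}_y \\
\frac{i}{\hbar}(P_xP_y - P_yP_x) &= \frac{e}{c}\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) = \frac{e}{c}\,\mathcal{H}_z
\end{aligned} \tag{5.17} $$

where $\mathcal{H}_x$, $\mathcal{H}_y$, $\mathcal{H}_z$ are the magnetic field components. Because of
this, when there is a magnetic field the transition from operators p
to operators P gives different results depending on whether we
make the substitution in (5.8) or in (5.15). If we shift from
$p_x$, $p_y$, $p_z$ to $P_x$, $P_y$, $P_z$ in (5.8), we return to (5.6), which we
denote by $H^0$:

$$ H^0 = \frac{1}{2m}\left[\left(p_x + \frac{e}{c}A_x\right)^2 + \left(p_y + \frac{e}{c}A_y\right)^2 + \left(p_z + \frac{e}{c}A_z\right)^2\right] - e\Phi \tag{5.18} $$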


But if we do this in (5.15) and then use the relations (1.6) for
matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ and the commutation relations (5.17), we arrive
at the operator

$$ H^{*} = H^0 + \mu^0(\sigma_x\mathcal{H}_x + \sigma_y\mathcal{H}_y + \sigma_z\mathcal{H}_z) \tag{5.19} $$

For the sake of brevity we have put

$$ \mu^0 = \frac{e\hbar}{2mc} = 0.927 \times 10^{-20}\ \text{erg G}^{-1} \tag{5.20} $$

This can be considered as the magnetic moment of the electron
(the Bohr magneton).
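The numerical value in (5.20) is easy to reproduce. The following sketch assumes rounded Gaussian (CGS) values of the constants, which are not quoted in the text, and evaluates eħ/(2mc); the constant and function names are chosen for this example only.

```python
# Rounded CGS (Gaussian) values; assumed here, not quoted from the text.
E_CHARGE = 4.80320e-10      # magnitude of the electron charge, esu
HBAR = 1.054572e-27         # reduced Planck constant, erg s
M_ELECTRON = 9.109384e-28   # electron mass, g
C_LIGHT = 2.997925e10       # speed of light, cm/s

def bohr_magneton():
    """Return e*hbar / (2*m*c) in erg per gauss, the combination in (5.20)."""
    return E_CHARGE * HBAR / (2.0 * M_ELECTRON * C_LIGHT)

mu0 = bohr_magneton()
print(f"{mu0:.3e} erg/G")
```

With these inputs the result agrees with the quoted 0.927 × 10⁻²⁰ erg G⁻¹ to the three figures given.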

Hamiltonian (5.19) is the generalization of the Hamiltonian in
Schrödinger's theory when there is a magnetic field acting on the
electron (we ignore relativistic corrections). We will call $H^{*}$ the
Pauli operator, and the wave equation

$$ H^{*}\psi = i\hbar\frac{\partial\psi}{\partial t} \tag{5.21} $$

the Pauli wave equation.
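The step from (5.15) to (5.19) is stated in this section without the intermediate algebra. A sketch of it, under the assumptions that the Pauli matrices satisfy σ_x² = 1 and σ_yσ_z = −σ_zσ_y = iσ_x (with cyclic permutations) and that (5.17) is read as (i/ħ)(P_yP_z − P_zP_y) = (e/c)ℋ_x and cyclically, might run as follows:

```latex
\begin{align*}
(\sigma_x P_x + \sigma_y P_y + \sigma_z P_z)^2
 &= P_x^2 + P_y^2 + P_z^2
   + i\sigma_x (P_y P_z - P_z P_y)
   + i\sigma_y (P_z P_x - P_x P_z)
   + i\sigma_z (P_x P_y - P_y P_x) \\
 &= P_x^2 + P_y^2 + P_z^2
   + \frac{e\hbar}{c}\,
     (\sigma_x \mathcal{H}_x + \sigma_y \mathcal{H}_y + \sigma_z \mathcal{H}_z)
\end{align*}
```

Dividing by 2m and adding −eΦ then gives H⁰ plus the spin term with the coefficient eħ/(2mc), which is the constant μ⁰ of (5.20).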

6. Operator P in spherical and cylindrical
coordinates and its relation to $\mathcal{M}$

Introduction of the spin dependent operator

$$ P = \sigma_x P_x + \sigma_y P_y + \sigma_z P_z \tag{6.1} $$

is an important point in passing from the Schrödinger equation
to the Pauli equation. Let us study the interrelation of the
operator P with $\mathcal{M}$, the operator of the spin-orbit scalar of
angular momentum. In Section 5 we established that these two
operators anticommute [see (5.13)]. To study this interrelation
more closely it is convenient to transform both operators to
spherical coordinates. For $\mathcal{M}$ we did this in Section 2. As for P,
we will proceed in two steps: first we transform P to cylindrical
coordinates, and then we transform the result to spherical coordinates.

Since vector potential $A_x$, $A_y$, $A_z$ is covariant, the quantities

$$ P_x = p_x + \frac{e}{c}A_x, \qquad P_y = p_y + \frac{e}{c}A_y, \qquad P_z = p_z + \frac{e}{c}A_z \tag{6.2} $$

transform in the same way as $p_x$, $p_y$, $p_z$. Hence it is sufficient to
transform only $p_x$, $p_y$, $p_z$ (assuming, for the time being, that the
vector potential is zero) and then include $A_x$, $A_y$, $A_z$ in the final
formulas.

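Let us introduce the cylindrical coordinates, $\rho$ and $\varphi$, in the
following way:

$$ x = \rho\cos\varphi, \qquad y = \rho\sin\varphi, \qquad z = z \tag{6.3} $$

The partial derivatives of $\psi$ with respect to both new and old
coordinates are related thus:

$$ \frac{\partial\psi}{\partial x} = \cos\varphi\,\frac{\partial\psi}{\partial\rho} - \frac{\sin\varphi}{\rho}\frac{\partial\psi}{\partial\varphi}, \qquad \frac{\partial\psi}{\partial y} = \sin\varphi\,\frac{\partial\psi}{\partial\rho} + \frac{\cos\varphi}{\rho}\frac{\partial\psi}{\partial\varphi} \tag{6.4} $$

whence

$$ p_x = \cos\varphi\,p_\rho - \frac{\sin\varphi}{\rho}\,p_\varphi, \qquad p_y = \sin\varphi\,p_\rho + \frac{\cos\varphi}{\rho}\,p_\varphi \tag{6.4*} $$

whereas $p_z$ does not change at all.
Next we write P as

$$ P = \sigma_1 p_x + \sigma_2 p_y + \sigma_3 p_z \tag{6.5} $$

with $\sigma_1$, $\sigma_2$, $\sigma_3$ being the matrices (1.12). Substitution of (6.4*)
into (6.5) yields

$$ P = (\sigma_1\cos\varphi + \sigma_2\sin\varphi)\,p_\rho + \frac{1}{\rho}(-\sigma_1\sin\varphi + \sigma_2\cos\varphi)\,p_\varphi + \sigma_3 p_z \tag{6.6} $$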

The last expression can be considerably simplified if we choose
an appropriate canonical transformation

$$ \psi' = S\psi, \qquad P' = SPS^{+} \tag{6.7} $$

We note that in Section 2 we used this transformation in studying
the angular momentum operators. There we introduced the matrices

$$ S = \cos\frac{\varphi}{2} + i\sigma_3\sin\frac{\varphi}{2}, \qquad S^{+} = \cos\frac{\varphi}{2} - i\sigma_3\sin\frac{\varphi}{2} \tag{6.8} $$

With the help of these we can express the coefficients of $p_\rho$, $p_\varphi$, $p_z$
in (6.6) in the following way:

$$ \begin{aligned}
\sigma_1\cos\varphi + \sigma_2\sin\varphi &= S^{+}\sigma_1 S \\
-\sigma_1\sin\varphi + \sigma_2\cos\varphi &= S^{+}\sigma_2 S \\
\sigma_3 &= S^{+}\sigma_3 S
\end{aligned} \tag{6.9} $$

[the last formulas are equivalent to (2.14)]. If we apply the
transformation to operator (6.6) and put

$$ p'_\rho = S p_\rho S^{+}, \qquad p'_\varphi = S p_\varphi S^{+}, \qquad p'_z = S p_z S^{+} \tag{6.10} $$

we can write

$$ P' = \sigma_1 p'_\rho + \frac{1}{\rho}\,\sigma_2 p'_\varphi + \sigma_3 p'_z \tag{6.11} $$

Since matrix S does not contain coordinates $\rho$ and z, it commutes
with $p_\rho$ and $p_z$, so that these operators do not change. As
for operator $p'_\varphi$, we found it in Section 2 [formula (2.16)]. Applying
this result, we get

$$ p'_\rho = p_\rho, \qquad p'_\varphi = p_\varphi - \frac{\hbar}{2}\sigma_3, \qquad p'_z = p_z \tag{6.12} $$

Hence operator P transformed to cylindrical coordinates is

$$ P' = \sigma_1 p_\rho + \frac{1}{\rho}\,\sigma_2\left(p_\varphi - \frac{\hbar}{2}\sigma_3\right) + \sigma_3 p_z \tag{6.13} $$

or, after we change $\sigma_2\sigma_3$ to $i\sigma_1$,

$$ P' = \sigma_1\left(p_\rho - \frac{i\hbar}{2\rho}\right) + \frac{1}{\rho}\,\sigma_2 p_\varphi + \sigma_3 p_z \tag{6.14} $$

To account for the vector potential we change $p_\rho$, $p_\varphi$, $p_z$ in (6.14)
to

$$ P_\rho = p_\rho + \frac{e}{c}A_\rho, \qquad P_\varphi = p_\varphi + \frac{e}{c}A_\varphi, \qquad P_z = p_z + \frac{e}{c}A_z \tag{6.15} $$

where $A_\rho$, $A_\varphi$, $A_z$ are the generalized components of the vector
potential and satisfy the relationship

$$ A_x\,dx + A_y\,dy + A_z\,dz = A_\rho\,d\rho + A_\varphi\,d\varphi + A_z\,dz \tag{6.16} $$

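We now find P in spherical coordinates. The transition from
the cylindrical to the spherical coordinates

$$ z = r\cos\theta, \qquad \rho = r\sin\theta \tag{6.17} $$

is performed according to formulas similar to those used in
passing from rectangular to cylindrical coordinates. By analogy
with (6.4) and (6.4*) we get

$$ \frac{\partial\psi}{\partial z} = \cos\theta\,\frac{\partial\psi}{\partial r} - \frac{\sin\theta}{r}\frac{\partial\psi}{\partial\theta}, \qquad \frac{\partial\psi}{\partial\rho} = \sin\theta\,\frac{\partial\psi}{\partial r} + \frac{\cos\theta}{r}\frac{\partial\psi}{\partial\theta} \tag{6.18} $$

whence

$$ p_z = \cos\theta\,p_r - \frac{\sin\theta}{r}\,p_\theta, \qquad p_\rho = \sin\theta\,p_r + \frac{\cos\theta}{r}\,p_\theta \tag{6.18*} $$

Substituting (6.18*) into (6.14), we obtain

$$ P' = (\sigma_1\cos\theta - \sigma_3\sin\theta)\,\frac{1}{r}\,p_\theta + \frac{1}{r\sin\theta}\,\sigma_2 p_\varphi + (\sigma_1\sin\theta + \sigma_3\cos\theta)\,p_r - \frac{i\hbar}{2r\sin\theta}\,\sigma_1 \tag{6.19} $$

Now, to simplify this we introduce a canonical transformation
similar to (6.7):

$$ \psi'' = T\psi', \qquad P'' = TP'T^{+} \tag{6.20} $$

where matrices T and $T^{+}$ are

$$ T = \cos\frac{\theta}{2} + i\sigma_2\sin\frac{\theta}{2}, \qquad T^{+} = \cos\frac{\theta}{2} - i\sigma_2\sin\frac{\theta}{2} \tag{6.21} $$

The coefficients of the first three terms on the right of (6.19) can
be expressed as

$$ \begin{aligned}
\sigma_1\cos\theta - \sigma_3\sin\theta &= T^{+}\sigma_1 T \\
\sigma_2 &= T^{+}\sigma_2 T \\
\sigma_1\sin\theta + \sigma_3\cos\theta &= T^{+}\sigma_3 T
\end{aligned} \tag{6.22} $$

and $\sigma_1$ in the last term as

$$ \sigma_1 = T^{+}(\sigma_1\cos\theta + \sigma_3\sin\theta)\,T \tag{6.22*} $$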
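Relations (6.22) and (6.22*) lend themselves to a direct numerical check. The sketch below is an illustration only: it assumes the standard 2×2 Pauli matrices and T = cos(θ/2) + iσ_2 sin(θ/2) as in (6.21), and it tests T⁺σ_1T = σ_1 cos θ − σ_3 sin θ, T⁺σ_2T = σ_2, T⁺σ_3T = σ_1 sin θ + σ_3 cos θ and T⁺(σ_1 cos θ + σ_3 sin θ)T = σ_1. The helper names are invented for the example.

```python
import math

I2 = [[1, 0], [0, 1]]
SIGMA1 = [[0, 1], [1, 0]]
SIGMA2 = [[0, -1j], [1j, 0]]
SIGMA3 = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]

def lincomb(ca, a, cb, b):
    """Return ca*a + cb*b for 2x2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """True if two 2x2 matrices agree entry by entry within tol."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def t_matrix(theta, sign=+1):
    """T of (6.21) for sign=+1, its adjoint T+ for sign=-1."""
    return lincomb(math.cos(theta / 2), I2, sign * 1j * math.sin(theta / 2), SIGMA2)

def sandwich(matrix, theta):
    """Compute T+ matrix T."""
    return matmul(matmul(t_matrix(theta, -1), matrix), t_matrix(theta))

theta = 1.1
c, s = math.cos(theta), math.sin(theta)
ok_first = close(sandwich(SIGMA1, theta), lincomb(c, SIGMA1, -s, SIGMA3))
ok_second = close(sandwich(SIGMA2, theta), SIGMA2)
ok_third = close(sandwich(SIGMA3, theta), lincomb(s, SIGMA1, c, SIGMA3))
ok_last = close(sandwich(lincomb(c, SIGMA1, s, SIGMA3), theta), SIGMA1)
print(ok_first, ok_second, ok_third, ok_last)
```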

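The new operator $P''$ then reads

$$ P'' = \frac{\sigma_1}{r}\,p''_\theta + \frac{\sigma_2}{r\sin\theta}\,p''_\varphi + \sigma_3\,p''_r - \frac{i\hbar}{2r\sin\theta}(\sigma_1\cos\theta + \sigma_3\sin\theta) \tag{6.23} $$

where

$$ p''_\theta = T p_\theta T^{+}, \qquad p''_\varphi = T p_\varphi T^{+}, \qquad p''_r = T p_r T^{+} \tag{6.24} $$

Since T contains only one coordinate, $\theta$, we have

$$ p''_r = p_r, \qquad p''_\varphi = p_\varphi \tag{6.25} $$

whereas

$$ p''_\theta = p_\theta - i\hbar\,T\frac{\partial T^{+}}{\partial\theta} = p_\theta - \frac{\hbar}{2}\sigma_2 \tag{6.25*} $$

Substituting (6.25) and (6.25*) into (6.23), we get the final formula

$$ P'' = \frac{\sigma_1}{r}\left(p_\theta - \frac{i\hbar}{2}\cot\theta\right) + \frac{\sigma_2}{r\sin\theta}\,p_\varphi + \sigma_3\left(p_r - \frac{i\hbar}{r}\right) \tag{6.26} $$

We note that operator P in both cylindrical and spherical coordinates
acquires a somewhat simpler form if we effect the transformations

$$ \psi^{*} = \rho^{1/2}\,\psi' \quad \text{for cylindrical coordinates} \tag{6.27} $$

$$ \psi^{*} = r(\sin\theta)^{1/2}\,\psi'' \quad \text{for spherical coordinates} \tag{6.28} $$

Each of these corresponds to a specific transformation of P:

$$ P^{*} = \rho^{1/2}\,P'\,\rho^{-1/2} \tag{6.29} $$

$$ P^{*} = r(\sin\theta)^{1/2}\,P''\,r^{-1}(\sin\theta)^{-1/2} \tag{6.30} $$

After we make these transformations we get

$$ P^{*} = \sigma_1 p_\rho + \frac{\sigma_2}{\rho}\,p_\varphi + \sigma_3 p_z \tag{6.31} $$

for cylindrical coordinates, and

$$ P^{*} = \frac{\sigma_1}{r}\,p_\theta + \frac{\sigma_2}{r\sin\theta}\,p_\varphi + \sigma_3 p_r \tag{6.32} $$

for spherical coordinates.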

We can also show that transformations (6.27) and (6.28) simplify
the expressions for the probability of the electron's coordinates
being within given limits. What we mean is that the probability
for the inequalities

$$ \rho < \rho' < \rho + d\rho, \qquad \varphi < \varphi' < \varphi + d\varphi, \qquad z < z' < z + dz \tag{6.33} $$

is

$$ |\psi'|^2\,\rho\,d\rho\,d\varphi\,dz = |\psi^{*}|^2\,d\rho\,d\varphi\,dz \tag{6.34} $$

and the probability for the inequalities

$$ \theta < \theta' < \theta + d\theta, \qquad \varphi < \varphi' < \varphi + d\varphi, \qquad r < r' < r + dr \tag{6.35} $$

is (with a new value of $\psi^{*}$)

$$ |\psi''|^2\,r^2\sin\theta\,d\theta\,d\varphi\,dr = |\psi^{*}|^2\,d\theta\,d\varphi\,dr \tag{6.36} $$

Now we are prepared to express P in terms of the operator of
the spin-orbit scalar of angular momentum, $\mathcal{M}$. This is most
easily done in spherical coordinates. Both operators were subjected
to the same transformations: the transformation matrices S and T
are the same for both operators [see (2.11) and (6.8) for S and
(2.18) and (6.21) for T]. We can thereby deal with the transformed
operators, $P^{*}$ and $\mathcal{M}^{*}$:

$$ \mathcal{M}^{*} = -\frac{\sigma_1}{\sin\theta}\,p_\varphi + \sigma_2\,p_\theta \tag{6.37} $$

We can easily see that

$$ P^{*} = \sigma_3\left(p_r + \frac{i}{r}\,\mathcal{M}^{*}\right) \tag{6.38} $$

But the properties of matrices $\sigma_1$ and $\sigma_2$ imply that $\mathcal{M}^{*}$ anticommutes
with $\sigma_3$. For this reason the equality

$$ P^{*} = \left(p_r - \frac{i}{r}\,\mathcal{M}^{*}\right)\sigma_3 \tag{6.39} $$

also holds. Here matrix $\sigma_3$ can be interpreted as the radial component
of the spin. Indeed, if we put

$$ \sigma_r = \frac{1}{r}(x\sigma_1 + y\sigma_2 + z\sigma_3) \tag{6.40} $$

we will have

$$ TS\sigma_r S^{+}T^{+} = \sigma_3 \tag{6.41} $$

so that after we have used in our formulas transformations S
and T, matrix $\sigma_r$ becomes $\sigma_3$. Hence prior to the application of S
and T the formulas corresponding to (6.38) and (6.39) are

$$ P = \sigma_r\left(p_r + \frac{i}{r}\,\mathcal{M}\right) = \left(p_r - \frac{i}{r}\,\mathcal{M}\right)\sigma_r \tag{6.42} $$

This can also be verified directly.

According to (5.13) operators $P^{*}$ and $\mathcal{M}^{*}$ (as well as P and $\mathcal{M}$)
are anticommutative. This is readily seen if we multiply (6.38)
from the left by $\mathcal{M}^{*}$, then multiply (6.39) from the right by $\mathcal{M}^{*}$,
and add the two products. We then get

$$ \mathcal{M}^{*}P^{*} + P^{*}\mathcal{M}^{*} = p_r(\mathcal{M}^{*}\sigma_3 + \sigma_3\mathcal{M}^{*}) = 0 \tag{6.43} $$

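Finally, let us give the formula for P in curvilinear orthogonal
coordinates.¹ If in coordinates $q_1$, $q_2$, $q_3$ the square of the element
of arc length is

$$ ds^2 = h_1^2\,dq_1^2 + h_2^2\,dq_2^2 + h_3^2\,dq_3^2 \tag{6.44} $$

then

$$ P = \frac{\sigma_1}{h_1}\left(p_1 - \frac{i\hbar}{2}\frac{\partial}{\partial q_1}\log(h_2h_3)\right) + \frac{\sigma_2}{h_2}\left(p_2 - \frac{i\hbar}{2}\frac{\partial}{\partial q_2}\log(h_3h_1)\right) + \frac{\sigma_3}{h_3}\left(p_3 - \frac{i\hbar}{2}\frac{\partial}{\partial q_3}\log(h_1h_2)\right) \tag{6.45} $$

where by $p_k$ we have denoted the operators

$$ p_k = -i\hbar\frac{\partial}{\partial q_k} \tag{6.46} $$

In the presence of an electromagnetic field we must change $p_k$ to

$$ P_k = p_k + \frac{e}{c}A_k \tag{6.47} $$

where the $A_k$ are the covariant components of the vector potential
and satisfy the condition

$$ A_x\,dx + A_y\,dy + A_z\,dz = A_1\,dq_1 + A_2\,dq_2 + A_3\,dq_3 \tag{6.48} $$

We readily see that in the particular cases of cylindrical and
spherical coordinates formula (6.45) transforms into (6.14) and
(6.26) respectively.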
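The reduction to the two special cases can be illustrated numerically. The sketch below is an assumption-laden example: it reads the correction to p_i in (6.45) as −(iħ/2) ∂ log(h_j h_k)/∂q_i, with j and k the other two indices, and evaluates the real factor (1/2) ∂ log(h_j h_k)/∂q_i by central differences. The scale factors are those of cylindrical coordinates (ρ, φ, z) and of spherical coordinates taken in the order (θ, φ, r) used in (6.26); the function names are illustrative.

```python
import math

def half_log_derivatives(scale_factors, q, step=1e-6):
    """For i = 0, 1, 2 return (1/2) d/dq_i log(h_j * h_k), j and k the other indices."""
    def log_pair(i, point):
        h = scale_factors(point)
        return math.log(h[(i + 1) % 3] * h[(i + 2) % 3])

    result = []
    for i in range(3):
        up = list(q)
        down = list(q)
        up[i] += step
        down[i] -= step
        result.append(0.5 * (log_pair(i, up) - log_pair(i, down)) / (2 * step))
    return result

def cylindrical(q):
    """Scale factors for ds^2 = d rho^2 + rho^2 d phi^2 + dz^2."""
    rho, phi, z = q
    return (1.0, rho, 1.0)

def spherical(q):
    """Scale factors for coordinates ordered (theta, phi, r)."""
    theta, phi, r = q
    return (r, r * math.sin(theta), 1.0)

rho, theta, r = 1.7, 0.8, 2.5
cyl = half_log_derivatives(cylindrical, (rho, 0.3, 0.0))
sph = half_log_derivatives(spherical, (theta, 0.3, r))
print(cyl, [1 / (2 * rho), 0.0, 0.0])
print(sph, [0.5 / math.tan(theta), 0.0, 1 / r])
```

Multiplied by −iħ, the nonzero entries are the terms −iħ/(2ρ) of (6.14) and −(iħ/2) cot θ and −iħ/r of (6.26).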


1 This was done for a more general case in the author's work “The Dirac
wave equation and Riemannian geometry” in J. Russian Phys.-Chem. Soc. 82:
133 (1930).

7. An electron in a magnetic field

Let us study the Pauli operator when there is a constant magnetic
field. For the sake of clarity we will use rectangular Cartesian
coordinates. If the magnetic field is sufficiently weak, in $H^0$
we can drop the terms that contain the square of the vector potential.
In the terms that are linear in the vector potential we
substitute for $A_x$, $A_y$, $A_z$ the expressions

$$ \begin{aligned}
A_x &= \tfrac12(z\mathcal{H}_y - y\mathcal{H}_z) \\
A_y &= \tfrac12(x\mathcal{H}_z - z\mathcal{H}_x) \\
A_z &= \tfrac12(y\mathcal{H}_x - x\mathcal{H}_y)
\end{aligned} \tag{7.1} $$

This gives

$$ A_x p_x + A_y p_y + A_z p_z = \tfrac12(m_x\mathcal{H}_x + m_y\mathcal{H}_y + m_z\mathcal{H}_z) \tag{7.2} $$

where $m_x$, $m_y$, $m_z$ are the components of the electron's orbital
angular momentum [see (1.1)].
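Relation (7.2) is at bottom the vector identity (ℋ × r)·p = ℋ·(r × p). The sketch below checks it with ordinary numbers standing in for the operators, which is an assumption of the example, and also confirms by finite differences that the potential (7.1), written as A = (1/2) ℋ × r, has curl equal to the constant field. All names and sample values are illustrative.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def vector_potential(r, field):
    """A = (1/2) field x r, which is (7.1) written as a cross product."""
    return tuple(0.5 * x for x in cross(field, r))

def curl(potential, r, step=1e-5):
    """Curl of a vector function of position, by central differences."""
    def d(component, axis):
        up = list(r)
        down = list(r)
        up[axis] += step
        down[axis] -= step
        return (potential(tuple(up))[component]
                - potential(tuple(down))[component]) / (2 * step)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

field = (0.3, -1.2, 0.5)     # sample constant field components
r = (1.0, 2.0, -0.7)         # sample position
p = (0.4, 0.1, 2.2)          # sample momentum

lhs = dot(vector_potential(r, field), p)      # A . p
rhs = 0.5 * dot(cross(r, p), field)           # (1/2) m . field, with m = r x p
field_from_curl = curl(lambda pos: vector_potential(pos, field), r)
print(lhs, rhs, field_from_curl)
```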

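Using (7.2), we obtain the approximation for $H^0$:

$$ H^0 = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) - e\Phi + \frac{e}{2mc}(m_x\mathcal{H}_x + m_y\mathcal{H}_y + m_z\mathcal{H}_z) \tag{7.3} $$

If to (7.3) we now add the spin dependent terms [see (5.19)], we
get

$$ H^{*} = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) - e\Phi + \frac{e}{2mc}\left[(m_x + \hbar\sigma_x)\mathcal{H}_x + (m_y + \hbar\sigma_y)\mathcal{H}_y + (m_z + \hbar\sigma_z)\mathcal{H}_z\right] \tag{7.4} $$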

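This expression contains the scalar product of the magnetic
field by the vector of the electron's magnetic moment

$$ \begin{aligned}
\mu_x &= -\frac{e}{2mc}(m_x + \hbar\sigma_x) \\
\mu_y &= -\frac{e}{2mc}(m_y + \hbar\sigma_y) \\
\mu_z &= -\frac{e}{2mc}(m_z + \hbar\sigma_z)
\end{aligned} \tag{7.5} $$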

This vector has two parts: the orbital and the spin part. The
first is proportional to the electron's orbital angular momentum

$$ m_x, \quad m_y, \quad m_z \tag{7.6} $$

whereas the second is proportional to the intrinsic angular momentum
(spin)

$$ \frac{\hbar}{2}\sigma_x, \quad \frac{\hbar}{2}\sigma_y, \quad \frac{\hbar}{2}\sigma_z \tag{7.7} $$

The ratio of the magnetic moment to the mechanical angular
momentum of an electron, or the gyromagnetic ratio, associated
with spin is twice the gyromagnetic ratio for orbital motion. This
is sometimes called the “double magnetism” of electron spin.

In the central-field problem the correction term of $H^{*}$ in (7.4),
which depends on the magnetic field, commutes with the principal
term, (5.7). Hence the correction to the energy level, due to the
magnetic field, consists in adding an eigenvalue of the correction
term in (7.4). If we choose the z axis directed along the field, the
correction will be

$$ \Delta E = \frac{e\hbar}{2mc}\,\mathcal{H}\,(m' \pm 1) \tag{7.8} $$

where $\hbar m'$ is an eigenvalue of $m_z$.

The correction $\Delta E$ due to electron spin does not, however, bring
any new levels into the picture since m' is an integer. Thus
changing m' to m' ± 1 results in nothing. It is the relativistic
effects that here play an important role.

The Pauli operator, $H^{*}$, does not account for these corrections
[see formula (7.4)]. If we include them, then even in a central
field the radial equation contains not only the quantum number l
of Schrödinger's theory but also the quantum number k. The last is
determined from the eigenvalue equation for spherical harmonics
with spin,

$$ \mathcal{M}\psi = k\hbar\psi \tag{7.9} $$

[see (1.22)], and is connected with l in the following manner:

$$ k(k-1) = l(l+1) \tag{7.10} $$

[see (1.20)].

We know that at l = 0 there is only one value of k, and that is
k = 1. But at l = 1, 2, ... two values of k are possible, k = l + 1
and k = -l. As a result Schrödinger's energy level corresponding
to a given value of l (and a given value of n, the principal quantum
number) at l ≥ 1 splits into two adjacent components, which
form a doublet. This is customarily called the relativistic doublet.
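The bookkeeping between k and l can be put in a few lines. The sketch below is a small illustration with invented function names: it takes k(k − 1) = l(l + 1) from (7.10) together with the statement that k = 0 does not occur, and lists the values of k for the first few l.

```python
def l_from_k(k):
    """Orbital number l for a nonzero integer k, from l + 1/2 = |k - 1/2|."""
    if k == 0:
        raise ValueError("k = 0 does not occur")
    return k - 1 if k > 0 else -k

def k_values(l):
    """Values of k compatible with a given l >= 0: one for l = 0, two otherwise."""
    return [1] if l == 0 else [l + 1, -l]

for l in range(4):
    ks = k_values(l)
    # Every listed k must satisfy k(k - 1) = l(l + 1) and map back to l.
    assert all(k * (k - 1) == l * (l + 1) and l_from_k(k) == l for k in ks)
    print(l, ks)
```

The two entries printed for each l ≥ 1 are the two components of the doublet described in the text.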

In the radial equation the order of magnitude of the relativistic
term, as compared with that of the principal term (the potential
energy), is usually denoted by $\gamma^2$, where

$$ \gamma = \frac{e^2}{\hbar c} \tag{7.11} $$

is a dimensionless constant, called the fine-structure constant. In
contrast, the influence of a magnetic field on an energy level is
given by (7.8).

The splitting of energy levels in a magnetic field is called the 
Zeeman effect. 

The exact theory of the Zeeman effect for the hydrogen atom will 
be dealt with at the end of this book on the basis of Dirac’s theory 
of the electron. Here we only want to point out that the behaviour 
of an electron in a magnetic field proves convincingly that the 
electron has a new degree of freedom associated with spin. 

The existence of this new degree of freedom plays an important
role in the quantum theory of many-electron systems (atoms and
molecules are examples). This theory cannot be formulated without
accounting for the symmetry of the wave function with
respect to all possible permutations of the electrons. The wave
function, expressed in terms of the whole set of variables $(x, y, z, \sigma)$
for each electron, must reverse its sign under interchange of any
two sets of variables corresponding to two electrons. This condition
is called the Pauli exclusion principle, or the antisymmetry
requirement for the wave function. We must note here that the
set of variables for each electron includes the spin variable, $\sigma$,
in addition to the position variables. It is then evident that the
need for spin arises already in the nonrelativistic theory.

The many-electron problem will be studied in Part IV. 




Part IV 


THE MANY-ELECTRON PROBLEM 
OF QUANTUM MECHANICS 
AND THE STRUCTURE OF ATOMS 


1. Symmetry properties of the wave function 

In the previous chapters we studied wave functions for the 
states of one electron. If the state is stationary, the wave function 
must satisfy the Schrodinger equation. When we are dealing with 
states of a system of n electrons, the wave function must possess 
certain properties of symmetry (it must be antisymmetric) under 
interchange of the position and spin variables of the electrons. 
This requirement is called, as we know, the Pauli exclusion prin¬ 
ciple. More than that, in many cases the total spin (or the total 
intrinsic angular momentum) of the system of electrons is known. 
This results in an additional restriction on the wave function. 

As we know, a one-electron wave function depends on the three
coordinates x, y, z and the spin variable $\sigma$, which takes on two
values only (for instance, $\sigma = +1$ and $\sigma = -1$). If we denote
the set of the three coordinates by r, we can write a one-electron
wave function as

$$ \psi(x, y, z, \sigma) = \psi(r, \sigma) \tag{1.1} $$

Now we turn to the wave function of a system of electrons. The
wave function depends on all the spatial and spin variables of
the electrons. For n electrons we have

$$ \psi = \psi(r_1, \sigma_1;\ r_2, \sigma_2;\ \ldots;\ r_n, \sigma_n) \tag{1.2} $$

It is often convenient to denote by $x_i$ all the variables referring
to the ith electron (that is, the spatial and spin variables). Then
the wave function for an n-electron system will be

$$ \psi = \psi(x_1, x_2, \ldots, x_n) \tag{1.3} $$

According to the Pauli exclusion principle, the wave function
must be antisymmetric with respect to the variables $x_1, x_2, \ldots, x_n$,
that is, it must change sign under interchange of any two variables.
For instance

$$ \psi(x_2, x_1, x_3, \ldots, x_n) = -\psi(x_1, x_2, x_3, \ldots, x_n) \tag{1.4} $$


What are the requirements for a system of n electrons to have
a definite value of total spin? First, we recall the main properties
of spin studied in Part III.

When there is only one electron, any operator acting on the
spin variable can be represented as a linear combination of the
three operators $\sigma_x$, $\sigma_y$, $\sigma_z$. These are defined by the following
relationships:

$$ \begin{aligned}
\sigma_x\psi(r, \sigma) &= \psi(r, -\sigma) \\
\sigma_y\psi(r, \sigma) &= -i\sigma\,\psi(r, -\sigma) \\
\sigma_z\psi(r, \sigma) &= \sigma\,\psi(r, \sigma)
\end{aligned} \tag{1.5} $$

If we consider $\psi$ to be a two component wave function, the first
component being $\psi(r, +1)$ and the second $\psi(r, -1)$, the operators
$\sigma_x$, $\sigma_y$, $\sigma_z$ act in the same way as the Pauli matrices

$$ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{1.6} $$

The operators

$$ s_x = \tfrac12\sigma_x, \qquad s_y = \tfrac12\sigma_y, \qquad s_z = \tfrac12\sigma_z \tag{1.7} $$

satisfy the commutation relations

$$ \begin{aligned}
s_y s_z - s_z s_y &= i s_x \\
s_z s_x - s_x s_z &= i s_y \\
s_x s_y - s_y s_x &= i s_z
\end{aligned} \tag{1.8} $$

which are characteristic of angular momentum in general (in
units of $\hbar$). Operators (1.7) can therefore be interpreted as the
operators of the components of the electron's intrinsic angular
momentum. For a many-electron system we can define, in the same
way as in (1.5), operators $\sigma_{lx}$, $\sigma_{ly}$, $\sigma_{lz}$ that act on the spin variable of
the lth electron. We have

$$ \begin{aligned}
\sigma_{lx}\psi &= \psi(r_1, \sigma_1;\ r_2, \sigma_2;\ \ldots;\ r_l, -\sigma_l;\ \ldots) \\
\sigma_{ly}\psi &= -i\sigma_l\,\psi(r_1, \sigma_1;\ r_2, \sigma_2;\ \ldots;\ r_l, -\sigma_l;\ \ldots) \\
\sigma_{lz}\psi &= \sigma_l\,\psi(r_1, \sigma_1;\ r_2, \sigma_2;\ \ldots;\ r_l, \sigma_l;\ \ldots)
\end{aligned} \tag{1.9} $$

The operators for the components of total spin angular momentum
(in units of $\hbar$) can then be defined by analogy with (1.7) in the
following way:

$$ \begin{aligned}
s_x &= \tfrac12(\sigma_{1x} + \sigma_{2x} + \cdots + \sigma_{nx}) \\
s_y &= \tfrac12(\sigma_{1y} + \sigma_{2y} + \cdots + \sigma_{ny}) \\
s_z &= \tfrac12(\sigma_{1z} + \sigma_{2z} + \cdots + \sigma_{nz})
\end{aligned} \tag{1.10} $$

These satisfy the same commutation relations (1.8). We also see
that commutation relations (1.8) allow for an operator

$$ s^2 = s_x^2 + s_y^2 + s_z^2 \tag{1.11} $$

which is the square of total spin angular momentum. This last
operator commutes with each of the operators $s_x$, $s_y$, $s_z$, and its
eigenvalues are s(s + 1), with s being half of a nonnegative integer.
If the number of electrons, n, is even, s is a positive integer
or zero. But if n is odd, s takes on half-odd-integral values. In
both cases the difference n/2 - s = k is a nonnegative integer;
k can be interpreted as the number of electron pairs with compensated
spin.

For a given value of s the eigenvalues of $s_x$, $s_y$, $s_z$ run through
a sequence of numbers

$$ -s,\ -s+1,\ \ldots,\ s-1,\ s \tag{1.12} $$

that is, 2s + 1 values in all.

We can represent operator $s^2$ in the following way:

$$ s^2 = n - \frac{n^2}{4} + \sum_{i<l}P_{il} \tag{1.13} $$

where by $P_{il}$ we have denoted a permutation of the spin variables
$\sigma_i$ and $\sigma_l$.
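Formula (1.13) can be tested by letting both sides act on an arbitrary spin function. The sketch below is an illustration under stated assumptions: a spin function of n = 3 electrons is stored as a dictionary from tuples of ±1 to complex amplitudes, the single-electron operators follow the rules (1.9), and P_il exchanges the ith and lth spin arguments. The function names and the random test function are invented for this example.

```python
import itertools
import random

def flip(cfg, l):
    """Reverse the sign of the l-th spin variable (0-based index)."""
    return cfg[:l] + (-cfg[l],) + cfg[l + 1:]

def sigma(axis, l, psi):
    """Apply sigma_lx, sigma_ly or sigma_lz, following (1.9), to a spin function."""
    out = {}
    for cfg in psi:
        if axis == "x":
            out[cfg] = psi[flip(cfg, l)]
        elif axis == "y":
            out[cfg] = -1j * cfg[l] * psi[flip(cfg, l)]
        else:
            out[cfg] = cfg[l] * psi[cfg]
    return out

def spin_component(axis, psi, n):
    """Total spin component (1.10): one half of the sum of the sigma_l."""
    out = {cfg: 0j for cfg in psi}
    for l in range(n):
        term = sigma(axis, l, psi)
        for cfg in out:
            out[cfg] += 0.5 * term[cfg]
    return out

def s_squared(psi, n):
    """s^2 = s_x^2 + s_y^2 + s_z^2, as in (1.11)."""
    out = {cfg: 0j for cfg in psi}
    for axis in "xyz":
        twice = spin_component(axis, spin_component(axis, psi, n), n)
        for cfg in out:
            out[cfg] += twice[cfg]
    return out

def swap(cfg, i, l):
    """Exchange the i-th and l-th spin variables."""
    items = list(cfg)
    items[i], items[l] = items[l], items[i]
    return tuple(items)

def permutation_form(psi, n):
    """Right side of (1.13): (n - n^2/4) psi plus the sum over i<l of P_il psi."""
    out = {cfg: (n - n * n / 4) * psi[cfg] for cfg in psi}
    for i in range(n):
        for l in range(i + 1, n):
            for cfg in out:
                out[cfg] += psi[swap(cfg, i, l)]
    return out

n = 3
random.seed(0)
psi = {cfg: complex(random.uniform(-1, 1), random.uniform(-1, 1))
       for cfg in itertools.product((1, -1), repeat=n)}
max_diff = max(abs(a - b) for a, b in
               zip(s_squared(psi, n).values(), permutation_form(psi, n).values()))
print(max_diff)
```

The printed difference should be at the level of rounding error, since the two sides are the same operator.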

We can now formulate the condition for a system of n electrons
to have a given value of total spin in the form of an equation:

$$ s^2\psi = s(s+1)\psi \tag{1.14} $$

What are the eigenfunctions of this equation? To determine these
we put k = n/2 - s and let $\alpha_1, \alpha_2, \ldots, \alpha_k$ be a set of k different
numbers from the sequence 1, 2, ..., n. We introduce a function
of the spin variables

$$ F_{\alpha_1\alpha_2\ldots\alpha_k} = F(\sigma_{\alpha_1}\sigma_{\alpha_2}\ldots\sigma_{\alpha_k} \mid \sigma_{\alpha_{k+1}}\sigma_{\alpha_{k+2}}\ldots\sigma_{\alpha_n}) \tag{1.15} $$

symmetric both in the arguments $\sigma_{\alpha_1}, \sigma_{\alpha_2}, \ldots, \sigma_{\alpha_k}$ that stand to
the left of the vertical bar and in the arguments $\sigma_{\alpha_{k+1}}, \sigma_{\alpha_{k+2}}, \ldots, \sigma_{\alpha_n}$
that stand to the right. We also introduce a set of $\binom{n}{k}$
functions of the position coordinates of all the electrons,

$$ \psi_{\alpha_1\alpha_2\ldots\alpha_k} = \psi_{\alpha_1\alpha_2\ldots\alpha_k}(r_1, r_2, \ldots, r_n) \tag{1.16} $$

which do not include spin variables as arguments. Functions
(1.16) and also functions (1.15) are symmetric in the labels
$\alpha_1, \alpha_2, \ldots, \alpha_k$. We can now build the series

$$ \psi(r_1, \sigma_1;\ r_2, \sigma_2;\ \ldots;\ r_n, \sigma_n) = \sum_{(\alpha_1\ldots\alpha_k)}\psi_{\alpha_1\alpha_2\ldots\alpha_k}(r_1, r_2, \ldots, r_n)\,F_{\alpha_1\alpha_2\ldots\alpha_k}(\sigma_1, \sigma_2, \ldots, \sigma_n) \tag{1.17} $$

On the basis of (1.13) we can show that the series (1.17) satisfies
Eq. (1.14) with a value of s equal to n/2 - k, provided that
the functions of position coordinates, (1.16), are restricted by the
subsidiary conditions

$$ \sum_{\alpha}\psi_{\alpha\alpha_2\ldots\alpha_k} = 0 \tag{1.18} $$

where $\alpha$ runs through all values of the sequence 1, 2, ..., n
except $\alpha_2, \ldots, \alpha_k$. The number of such conditions is $\binom{n}{k-1}$.

For the obtained eigenfunction of $s^2$ to be a possible physical
state of the system of electrons with a given total spin we must
see whether it satisfies the Pauli exclusion principle, that is,
whether it is antisymmetric in the $x_i$. This is so if we express
all functions (1.16) via the formula

$$\psi_{\alpha_1 \alpha_2 \ldots \alpha_k}(r_1, r_2, \ldots, r_n) = \varepsilon(P)\, \psi(r_{\alpha_1}, r_{\alpha_2}, \ldots, r_{\alpha_k} \mid r_{\alpha_{k+1}}, \ldots, r_{\alpha_n}) \tag{1.19}$$

in terms of one function of the electrons’ position coordinates,
namely

$$\psi \equiv \psi(r_1, r_2, \ldots, r_k \mid r_{k+1}, r_{k+2}, \ldots, r_n) \tag{1.20}$$


In (1.19) the labels $\alpha_1, \alpha_2, \ldots, \alpha_n$ are the numbers
$1, 2, \ldots, n$ taken in an arbitrary order, and $P$ is the permutation

$$P = \begin{pmatrix} 1 & 2 & \ldots & n \\ \alpha_1 & \alpha_2 & \ldots & \alpha_n \end{pmatrix} \tag{1.21}$$

which replaces 1 by $\alpha_1$, 2 by $\alpha_2$, etc. By $\varepsilon(P)$ we will
denote a number equal to $+1$ if permutation $P$ is even and $-1$ if
it is odd. What conditions must function (1.20) satisfy? It must

(1) be antisymmetric in the first $k$ arguments [which in (1.20)
stand to the left of the bar], that is, for instance

$$\psi(r_2, r_1, r_3, \ldots, r_k \mid r_{k+1}, \ldots, r_n) = -\psi(r_1, r_2, r_3, \ldots, r_k \mid r_{k+1}, \ldots, r_n) \tag{1.22}$$

(2) be antisymmetric in the last $n - k$ arguments [to the right
of the bar in (1.20)], that is, for instance

$$\psi(r_1, r_2, \ldots, r_k \mid r_{k+2}, r_{k+1}, r_{k+3}, \ldots, r_n) = -\psi(r_1, r_2, \ldots, r_k \mid r_{k+1}, r_{k+2}, r_{k+3}, \ldots, r_n) \tag{1.23}$$







(3) possess the property of cyclic symmetry, which is expressed
by the equality

$$\begin{aligned}
\psi(r_1, \ldots, r_{k-1}, r_k \mid r_{k+1}, r_{k+2}, \ldots, r_n)
= {} & \psi(r_1, \ldots, r_{k-1}, r_{k+1} \mid r_k, r_{k+2}, \ldots, r_n) + \ldots \\
& + \psi(r_1, \ldots, r_{k-1}, r_{k+l} \mid r_{k+1}, \ldots, r_{k+l-1}, r_k, r_{k+l+1}, \ldots, r_n) + \ldots \\
& + \psi(r_1, \ldots, r_{k-1}, r_n \mid r_{k+1}, \ldots, r_{n-1}, r_k)
\end{aligned} \tag{1.24}$$

The right-hand side consists of $n - k$ terms. Each of these is
obtained from the left-hand side by interchanging the argument
$r_k$, lying to the left of the bar, with one of the arguments
$r_{k+l}$ ($l = 1, 2, \ldots, n - k$) lying to the right of the bar.

The cyclic symmetry is the result of the subsidiary conditions
(1.18). Each of the $\binom{n}{k-1}$ conditions (1.18) corresponds to an
equality of type (1.24). This can be verified by a direct
computation, in which we must allow for the properties of
antisymmetry (1.22) and (1.23). Suppose that by $P$ we denote a
cyclic permutation of the set of arguments $r_k, r_{k+1}, \ldots, r_n$,
that is, a permutation in which each member of the set is replaced
by the next member, the last member taking the position of the
first. Then (1.24) can be rewritten as

$$(1 + P + P^2 + \ldots + P^{n-k})\,\psi = 0 \tag{1.25}$$

for even values of $n - k$, and

$$(1 - P + P^2 - \ldots - P^{n-k})\,\psi = 0 \tag{1.26}$$

for odd values.

In the particular case of two electrons the state with total spin 
zero (n = 2, k = 1) is described by a symmetric function of 
coordinates, and a state with total spin one (n = 2, k = 0) by an 
antisymmetric function. 

A very important example of a function of $n$ arguments
$r_1, r_2, \ldots, r_n$ that satisfies the three symmetry conditions just
formulated is the product of two determinants

$$\psi = \Psi^{(1)} \Psi^{(2)} \tag{1.27}$$

where

$$\Psi^{(1)} = \begin{vmatrix} \psi_1(r_1) & \ldots & \psi_1(r_k) \\ \vdots & & \vdots \\ \psi_k(r_1) & \ldots & \psi_k(r_k) \end{vmatrix}, \qquad
\Psi^{(2)} = \begin{vmatrix} \psi_1(r_{k+1}) & \ldots & \psi_1(r_n) \\ \vdots & & \vdots \\ \psi_{n-k}(r_{k+1}) & \ldots & \psi_{n-k}(r_n) \end{vmatrix} \tag{1.28}$$

These determinants use the one-electron functions

$$\psi_1(r),\ \psi_2(r),\ \ldots,\ \psi_{n-k}(r) \tag{1.29}$$

that depend on position coordinates only. The larger of the two
determinants contains all of the $n - k$ functions (1.29), whereas
the smaller contains only the first $k$ functions.
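The cyclic symmetry (1.24) of a product of two determinants can be checked numerically. The following is a minimal sketch, assuming three arbitrary one-dimensional trial orbitals (they are hypothetical and chosen only so that the determinants are nonzero); `psi` builds the product (1.27) and `cyclic_defect` returns the left-hand side of (1.24) minus its right-hand side.

```python
import math
from itertools import permutations


def det(matrix):
    """Determinant by the Leibniz formula (adequate for the tiny sizes used here)."""
    n = len(matrix)
    total = 0.0
    for perm in permutations(range(n)):
        # parity of the permutation from its inversion count
        inversions = sum(1 for a in range(n) for b in range(a + 1, n)
                         if perm[a] > perm[b])
        term = -1.0 if inversions % 2 else 1.0
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


# Hypothetical one-electron functions of a single coordinate x
ORBITALS = [
    lambda x: math.exp(-x * x),
    lambda x: x * math.exp(-x * x),
    lambda x: (2 * x * x - 1) * math.exp(-x * x),
]


def psi(points, k):
    """Product (1.27): points[:k] stand left of the bar, points[k:] right of it."""
    left, right = points[:k], points[k:]
    d1 = det([[ORBITALS[p](x) for x in left] for p in range(k)])
    d2 = det([[ORBITALS[p](x) for x in right] for p in range(len(right))])
    return d1 * d2


def cyclic_defect(points, k):
    """Left-hand side of (1.24) minus the sum of its n - k right-hand terms."""
    lhs = psi(points, k)
    rhs = 0.0
    for j in range(k, len(points)):
        swapped = list(points)
        # interchange r_k (index k - 1) with one argument right of the bar
        swapped[k - 1], swapped[j] = swapped[j], swapped[k - 1]
        rhs += psi(swapped, k)
    return lhs - rhs


sample = [0.3, -0.7, 1.1, 0.5, -1.4]
print(cyclic_defect(sample[:3], 1), cyclic_defect(sample[:4], 2))
```

The defect should vanish up to rounding for any sample points, provided the smaller determinant uses only orbitals that also enter the larger one.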

With this reasoning we are able to express the wave function
(1.2), which depends on the spin variables in addition to the
position coordinates, in terms of the Schrödinger wave function
(1.20), which depends only on the coordinates. In the process we
strictly take into account the Pauli exclusion principle and the
eigenvalue equation for spin angular momentum, (1.14).

Though the Schrödinger wave function does not depend on
spin variables, its properties do depend on the value of total
spin, since total spin affects the symmetry of the wave function.

This explains what at first glance seems a paradox: neither
the Schrödinger equation nor the wave function contains spin
variables, and yet the energy levels depend on the value of total
spin. The paradox is resolved as follows. Additional symmetry
conditions are imposed on the wave function that corresponds to
an energy level with given spin, and these conditions differ for
different values of total spin.

2. The Hamiltonian and its symmetry 

The wave function that describes a stationary state of a
many-electron system must be an eigenfunction of the appropriate
Hamiltonian. By analogy with classical mechanics the Hamiltonian
can be written as

$$H = -\frac{\hbar^2}{2m} \sum_{k=1}^{n} \nabla_k^2 + \sum_{k=1}^{n} U(x_k, y_k, z_k) + \sum_{k>l} \frac{e^2}{\sqrt{(x_k - x_l)^2 + (y_k - y_l)^2 + (z_k - z_l)^2}} \tag{2.1}$$

Here $\nabla_k^2$ is the Laplacian operator that acts on the $k$th electron;
$U(x, y, z)$ is the potential energy of the external field, external in
relation to the $n$ electrons (for instance, the potential energy of
the field produced by the nucleus of the atom or by the nuclei in
the case of a molecule); and the double sum is the potential energy
of interaction between the electrons. Hamiltonian (2.1) corresponds
to the case without a magnetic field. If the system of
electrons were in an external magnetic field, the Hamiltonian
would contain terms that depend on spin.

The energy levels and the stationary states of the system are
determined from the equation

$$H\psi = E\psi \tag{2.2}$$

where $H$ is given by (2.1). We have already noted that
though $H$ does not contain spin variables, the energy levels $E$
depend on the quantum number $s$ (spin angular momentum). The
explanation is that the symmetry properties of the Schrödinger
wave function, $\psi$, depend on the value of $s$.

For an atom, $H$ is spherically symmetric, that is, it does not
change under rotations of the coordinate axes in space. We can
then subject the Schrödinger wave function (a function of
coordinates) to the condition that it simultaneously be an
eigenfunction of the operator of the square of orbital angular
momentum (the quantum number $l$) and of the operator of one of
the components of angular momentum (the quantum number $m$).
If, in addition, this eigenfunction possesses symmetry properties
that correspond to a definite value of $s$, then it can be used to
build a function of type (1.2). This last is a wave function with
spin $s$. It satisfies the Pauli exclusion principle and is a
simultaneous eigenfunction of the following five operators: the
Hamiltonian, the square of orbital angular momentum, the square of
spin angular momentum, the square of total (spin and orbital)
angular momentum, and the component of total angular momentum
along a coordinate axis. The function is built by using the
so-called vector model, but we will not elaborate on this.

The Hamiltonian of a diatomic molecule has not spherical but
only axial symmetry (that is, it does not change under rotations
about the axis that connects the two atoms). Axial symmetry too
can be used to introduce quantum numbers and to partially
determine the wave functions.

If we use the spherical or axial symmetry that a system
possesses, we can introduce certain quantum numbers and thus
classify the energy levels. But considerations of symmetry are not
sufficient to determine the levels themselves and the stationary
states. An exact solution of Eq. (2.2) presents insurmountable
mathematical difficulties (with the exception of the case of one
electron). For this reason the development of approximate methods
is most important. The most fruitful of these is the self-consistent
field method, which we will consider next.

3. The self-consistent field method 

We can obtain the eigenvalue equation for the Hamiltonian by
applying the variational method, that is, requiring that

$$\delta W = 0 \tag{3.1}$$

with

$$W = \int \bar{\psi} H \psi \, dV, \qquad N = \int \bar{\psi} \psi \, dV \tag{3.2}$$

In these formulas we assume $\psi$ to be the function of coordinates
introduced in Section 1, that is, a function not dependent on spin
variables. The volume element of configuration space, $dV$, is then
the product of the differentials of all the coordinates of the
electrons:

$$dV = dx_1\, dy_1\, dz_1 \ldots dx_n\, dy_n\, dz_n \tag{3.3}$$

The normalization integral, $N$, can be considered a given constant.

To prove our statement we find the variations of the integrals
in (3.2). Since $H$ is hermitian,

$$\delta \int \bar{\psi} H \psi \, dV = \int \delta\bar{\psi}\, H \psi \, dV + \text{complex conjugate} \tag{3.4}$$

Also

$$\delta \int \bar{\psi} \psi \, dV = \int \delta\bar{\psi}\, \psi \, dV + \text{complex conjugate} \tag{3.5}$$

Then we multiply the second equality by a constant real quantity
$E$, subtract the product from the first equality, and set the
difference equal to zero. We get

$$\int \delta\bar{\psi}\, (H\psi - E\psi) \, dV + \text{complex conjugate} = 0 \tag{3.6}$$

This must hold for an arbitrary variation of the real and
imaginary parts of $\psi$, which is only possible if the coefficient of
$\delta\bar{\psi}$ in the integrand is zero. Whence

$$H\psi = E\psi \tag{3.7}$$

which is the eigenvalue equation for the Hamiltonian. This ends
the proof.
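The same statement can be seen in a finite-dimensional setting, where the integrals become sums. The sketch below is only an illustration: it assumes an arbitrary real symmetric 2×2 matrix in place of $H$ and a normalized trial vector $(\cos\theta, \sin\theta)$ in place of $\psi$, and scans $W$ over all directions. The extremal values of $W$ should coincide with the eigenvalues of the matrix.

```python
import math

# An arbitrary real symmetric (hermitian) matrix standing in for H
H = [[2.0, 1.0],
     [1.0, 3.0]]


def energy_functional(theta):
    """W for the normalized trial vector (cos theta, sin theta), so that N = 1."""
    v = (math.cos(theta), math.sin(theta))
    hv = (H[0][0] * v[0] + H[0][1] * v[1],
          H[1][0] * v[0] + H[1][1] * v[1])
    return v[0] * hv[0] + v[1] * hv[1]


def exact_eigenvalues():
    """Closed-form eigenvalues of the 2x2 symmetric matrix H."""
    mean = 0.5 * (H[0][0] + H[1][1])
    radius = math.hypot(0.5 * (H[0][0] - H[1][1]), H[0][1])
    return mean - radius, mean + radius


def scan(steps=20000):
    """Sample W over all directions and return its smallest and largest values."""
    values = [energy_functional(math.pi * i / steps) for i in range(steps)]
    return min(values), max(values)


print(exact_eigenvalues(), scan())
```

No trial vector gives a value of $W$ below the lowest eigenvalue, which is the property used in what follows to obtain the lowest energy level.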

The physical meaning of $W$ is that it is the mathematical
expectation of the system’s energy in state $\psi$. The extremal value
of $W$ is the energy level $E$. To obtain the lowest level with a given
value of quantum number $s$ we must, in varying the integral, choose
for comparison only functions $\psi$ that possess the “right” symmetry
properties and satisfy some general conditions (the derivatives
of $\psi$ must exist and all integrals must have finite values). If we
want to obtain the higher levels, we must in addition require that
the wave function corresponding to these levels be orthogonal to
the wave functions of the lower levels.

To simplify matters we can require that the wave function
satisfy some additional conditions. For instance, we can choose
the wave function in the form of a product of two determinants,
which in turn consist of the one-electron wave functions [see
(1.27) and (1.28)]. Of course, in this case we would obtain an
energy level that lies somewhat higher than the true one. The
difference, however, will be small. In the same manner we can
obtain all the other energy levels (also approximately).

Let us calculate the result of substituting the product of
determinants (1.28) into $W$. We will assume that the one-electron
functions are orthonormal:

$$\int \bar{\psi}_p(r)\, \psi_q(r)\, d\tau = \delta_{pq}, \qquad d\tau = dx\, dy\, dz \tag{3.8}$$

which can be done without loss of generality. We express the
Hamiltonian (2.1) as

$$H(r_1, \ldots, r_n) = \sum_{p=1}^{n} H(r_p) + \sum_{p>q} \frac{e^2}{|r_p - r_q|} \tag{3.9}$$

where

$$H(r) = -\frac{\hbar^2}{2m} \nabla^2 + U(r) \tag{3.10}$$

Then

$$\begin{aligned}
W = {} & \sum_{p=1}^{k} \int \bar{\psi}_p(r)\, H(r)\, \psi_p(r)\, d\tau + \sum_{p=1}^{n-k} \int \bar{\psi}_p(r)\, H(r)\, \psi_p(r)\, d\tau \\
& + \frac{e^2}{2} \iint \frac{\rho^{(1)}(r, r)\, \rho^{(1)}(r', r') - |\rho^{(1)}(r, r')|^2}{|r - r'|}\, d\tau\, d\tau' \\
& + \frac{e^2}{2} \iint \frac{\rho^{(2)}(r, r)\, \rho^{(2)}(r', r') - |\rho^{(2)}(r, r')|^2}{|r - r'|}\, d\tau\, d\tau' \\
& + e^2 \iint \frac{\rho^{(1)}(r, r)\, \rho^{(2)}(r', r')}{|r - r'|}\, d\tau\, d\tau'
\end{aligned} \tag{3.11}$$

where we have introduced the notations

$$\rho^{(1)}(r, r') = \sum_{p=1}^{k} \bar{\psi}_p(r)\, \psi_p(r') \tag{3.12}$$

$$\rho^{(2)}(r, r') = \sum_{p=1}^{n-k} \bar{\psi}_p(r)\, \psi_p(r') \tag{3.13}$$

The formulas obtained allow for a clear interpretation. First of
all, we assume that the overall wave function can be expressed in
terms of one-electron wave functions $\psi_p(r)$. This is equivalent to
each electron having its own wave function (we can also say, its
own orbit). The electrons in this case separate into two “clouds”.
The first corresponds to electrons on orbits $\psi_1, \psi_2, \ldots, \psi_k$, and
the second to electrons on orbits $\psi_1, \psi_2, \ldots, \psi_{n-k}$. The two clouds
have opposite spin angular momenta. Each orbit $\psi_1, \psi_2, \ldots, \psi_k$
“carries” two electrons with opposite spin angular momenta. The
remaining orbits, $\psi_{k+1}, \ldots, \psi_{n-k}$, have one electron each, and the
spin angular momenta of these electrons are the same. Thus for
the first $k$ orbits the total spin of each electron pair is zero. For
the remaining $n - 2k$ orbits the spin angular momenta add up.
Since the electron spin (its absolute value) is $1/2$, the total spin of
the system is $n/2 - k = s$, which is what we expected. We can
interpret $e$ times $\rho^{(1)}(r, r)$, that is,

$$e\rho^{(1)}(r, r) = e \sum_{p=1}^{k} |\psi_p(r)|^2 \tag{3.14}$$

as the space charge density of the first cloud [a similar
interpretation can be given to $e\rho^{(2)}(r, r)$]. As for (3.12) and (3.13),
they have different arguments $r$ and $r'$ and do not allow for a
classical interpretation. We call them “mixed” charge densities.

Let us now interpret the expression for the energy of the
electron system, (3.11). The first sum on the right is the kinetic and
the potential energy of the first cloud in the field produced by the
nuclei; the second sum is the same for the second cloud. The
terms in the first double integral that contain $\rho^{(1)}$ of the same
arguments represent the electrostatic energy of the electrons of the
first cloud. What remains in the first double integral contains the
mixed charge density. It does not have a classical interpretation,
and its presence in the energy is a specifically quantum effect (it
is called the exchange interaction). The second double integral
corresponds to the second electron cloud, and it can be interpreted
in a similar manner. Finally, the third double integral corresponds
to the electrostatic interaction between the two clouds.

Our interpretation is not rigorous. However, it is pictorial and
hence useful for understanding the physical meaning of our
formulas. In the strict sense formula (3.11) is the result of
substituting a wave function having the “right” symmetry properties
into the integral that is being varied.

If we vary (3.11) under conditions (3.8), we can find the system
of equations for the sought functions $\psi_p(r)$. The system is

$$2[H(r) + V(r)]\,\psi_p(r) - e^2 \int \frac{[\rho^{(1)}(r', r) + \rho^{(2)}(r', r)]\, \psi_p(r')}{|r - r'|}\, d\tau' = \sum_{q=1}^{n-k} \lambda_{qp}\, \psi_q(r), \qquad p = 1, 2, \ldots, k \tag{3.15}$$

$$[H(r) + V(r)]\,\psi_p(r) - e^2 \int \frac{\rho^{(2)}(r', r)\, \psi_p(r')}{|r - r'|}\, d\tau' = \sum_{q=1}^{n-k} \lambda_{qp}\, \psi_q(r), \qquad p = k+1, \ldots, n-k \tag{3.16}$$

In both formulas

$$V(r) = e^2 \int \frac{\rho^{(1)}(r', r') + \rho^{(2)}(r', r')}{|r - r'|}\, d\tau' \tag{3.17}$$

can be interpreted as $e$ times the potential of all the electrons. The
quantities $\lambda_{qp}$ are the Lagrange undetermined multipliers, which
correspond to the orthogonality conditions (3.8). These conditions
must be taken into account when building the variation $\delta W$. We
can consider the off-diagonal elements of matrix $\lambda_{qp}$ to be nonzero
only if one of the labels is greater than or equal to $k + 1$ and the
other is less than or equal to $k$.

We note that Eq. (3.16) for $\psi_p(r)$ does not contain the function
$\psi_p$ in its coefficients, so that if we consider all the other functions
to be known, the equation for $\psi_p(r)$ will be linear.

We can formulate this property of the equation in the following
way. We put

$$\rho_p^{(2)}(r, r') = \sum_{q=1}^{n-k} (1 - \delta_{pq})\, \bar{\psi}_q(r)\, \psi_q(r') \tag{3.18}$$

and by $V_p(r)$ denote an expression similar to (3.17) but with sums
(3.12) and (3.18) instead of (3.12) and (3.13). If we change $V$
to $V_p$ and $\rho^{(2)}$ to $\rho_p^{(2)}$, Eq. (3.16) retains its form.

Let us now turn to (3.15). We write this equation for the case
when each orbit has two electrons, that is, when $s = 0$. Then $n$
is an even number, $k = n/2$, and the sums (3.12) and (3.13)
coincide. Hence we can drop the upper index of $\rho$. More than that,
we can assume in this case that $\lambda_{qp}$ is a diagonal matrix and put

$$\lambda_{qp} = 2E_p\, \delta_{qp} \tag{3.19}$$

As a result

$$[H(r) + V(r)]\,\psi_p(r) - e^2 \int \frac{\rho(r', r)\, \psi_p(r')}{|r - r'|}\, d\tau' = E_p\, \psi_p(r) \tag{3.20}$$

where

$$V(r) = 2e^2 \int \frac{\rho(r', r')}{|r - r'|}\, d\tau' \tag{3.21}$$

and

$$\rho(r', r) = \sum_{q=1}^{n/2} \bar{\psi}_q(r')\, \psi_q(r) \tag{3.21*}$$

If we then denote

$$V_p(r) = 2e^2 \int \frac{\rho(r', r') - \tfrac{1}{2} |\psi_p(r')|^2}{|r - r'|}\, d\tau' \tag{3.22}$$

and introduce a function similar to (3.18)

$$\rho_p(r', r) = \sum_{q=1}^{n/2} (1 - \delta_{pq})\, \bar{\psi}_q(r')\, \psi_q(r)$$

we get Eq. (3.20) in the following form:

$$[H(r) + V_p(r)]\,\psi_p(r) - e^2 \int \frac{\rho_p(r', r)\, \psi_p(r')}{|r - r'|}\, d\tau' = E_p\, \psi_p(r)$$

Suppose that we drop the term with the integral. We then get the
Schrödinger equation for an electron in a field with potential
energy

$$\Phi = U(r) + V_p(r) \tag{3.23}$$

where $U(r)$ is the potential energy of the external field [see
(3.10)], and $V_p(r)$ is the potential energy of the field produced by
all the electrons except the given electron [$V_p(r)$ calculated by
(3.22) is proportional to a potential that corresponds to charge
density $2\rho - |\psi_p|^2$]. As for the term with the integral, it has no
classical counterpart. It is called the correction for quantum
exchange.

Without the integral, Eq. (3.20) (a somewhat less accurate
equation, in fact) was first suggested by the English
mathematician D. R. Hartree. However, he did not give a satisfactory
justification for such equations because in deriving them he used
neither the variational method nor the idea of a wave function of the
system as a whole. He proceeded from the obvious considerations
just mentioned. Hartree called such equations self-consistent field
equations (in the sense that the potential $V$ in the equation for
the wave functions is itself expressed in terms of these functions).
Subsequently they were named the Hartree equations.

The complete equations with the integral terms, equations which
take into account the symmetry of the system’s wave function for
a given value of spin angular momentum, have been obtained here
by using the variational method. In doing so we have confirmed
the Hartree equations. The complete equations have come to be
known as self-consistent field equations with quantum exchange.
In current literature these are also referred to as the
Hartree-Fock equations.

The complete equations can be derived in another way, proposed
by Dirac. This derivation differs in that the spin variables are
included in the one-electron wave functions at the very outset
(rather than excluded). The wave function (1.3) of a system of
electrons is approximately expressed by one determinant of type

$$\Psi = \begin{vmatrix} \psi_1(x_1) & \ldots & \psi_1(x_n) \\ \vdots & & \vdots \\ \psi_n(x_1) & \ldots & \psi_n(x_n) \end{vmatrix} \tag{3.24}$$

which contains the one-electron wave functions (1.1), called the
orbitals. The resulting equations for these orbitals are similar to
ours. What makes this method advantageous is the simple
mathematics involved (since we have to do with one determinant
instead of the product of two). The drawback is that Eq. (1.14)
for the operator of spin angular momentum is not satisfied
automatically but only after we have chosen the orbitals in a proper
way. When the problem possesses spherical symmetry, we can
express the orbitals $\psi_i(x)$ in the determinant (3.24) in terms of
the radial functions $R_{nl}$ and the spherical harmonics with spin in
such a way that the radial equations coincide with the ones we
obtained by the previous method.

The self-consistent field equations for the orbitals can also be
obtained from the theory of second quantization.


4. The equation for the valence electron 
and the operator of quantum exchange 


We consider a system of an odd number of electrons, $n = 2k + 1$,
which has a spin equal to $1/2$ (for example, an atom with one
valence electron). The overall wave function of such a system is a
product of two determinants. It contains $k$ wave functions
$\psi_1, \psi_2, \ldots, \psi_k$, which enter into both determinants, and one wave
function $\psi$, which enters into the larger determinant alone. We
can say that functions $\psi_1, \psi_2, \ldots, \psi_k$ describe $2k$ inner-shell
electrons with compensated spin (two on each orbit) and function $\psi$
describes the valence electron.

Using formula (3.11), we can write the energy of this system
in the form of a sum:

$$W = W_0 + W' \tag{4.1}$$

where

$$W_0 = 2 \sum_{p=1}^{k} \int \bar{\psi}_p(r)\, H(r)\, \psi_p(r)\, d\tau + e^2 \iint \frac{2\rho(r, r)\, \rho(r', r') - |\rho(r, r')|^2}{|r - r'|}\, d\tau\, d\tau' \tag{4.2}$$

is the energy of the inner-shell electrons, and

$$W' = \int \bar{\psi}(r)\, H(r)\, \psi(r)\, d\tau + e^2 \iint \frac{2\rho(r', r')\, \bar{\psi}(r)\, \psi(r) - \rho(r', r)\, \bar{\psi}(r)\, \psi(r')}{|r - r'|}\, d\tau\, d\tau' \tag{4.3}$$

is the energy of the valence electron in the field of the inner-shell
electrons. The mixed charge density $\rho(r, r')$ is

$$\rho(r, r') = \sum_{p=1}^{k} \bar{\psi}_p(r)\, \psi_p(r') \tag{4.4}$$

If we vary $W$ over all the wave functions simultaneously, we will
again arrive at the self-consistent field equations (3.15) and
(3.16).

We can modify our problem and first determine the wave
functions of the inner-shell electrons from the minimum condition
for $W_0$. Then, assuming that $\psi_1, \psi_2, \ldots, \psi_k$ are given, we can
determine the wave function of the outer electron, $\psi(r)$, from the
minimum condition for $W$ or $W'$ (these two differ by the constant
$W_0$).

This modified problem amounts, from the physical point of view,
to first determining the stationary state of a system that contains
one electron less (the atomic core) and then finding the state of
the valence electron, ignoring the polarization of the atomic core
by this electron.

For the wave functions of the inner-shell electrons we get a
system of equations (3.20), and for the valence electron a linear
integro-differential equation

$$[H(r) + V(r)]\,\psi(r) - e^2 \int \frac{\rho(r', r)\, \psi(r')}{|r - r'|}\, d\tau' = E\,\psi(r) \tag{4.5}$$

where

$$V(r) = 2e^2 \int \frac{\rho(r', r')}{|r - r'|}\, d\tau' \tag{4.6}$$

The wave function of the valence electron is always assumed to
be orthogonal to the wave functions of the inner-shell electrons:

$$\int \bar{\psi}_p(r)\, \psi(r)\, d\tau = 0, \qquad p = 1, 2, \ldots, k \tag{4.7}$$

It is easy to see that $\rho(r, r')$ and $V(r)$, determined in (4.4) and
(4.6), coincide with (3.21*) and (3.21), and Eq. (4.5) with
Eq. (3.20). Hence all the wave functions of the inner-shell
electrons satisfy the equation for the valence electron as well; all
are the eigenfunctions of the same linear integro-differential
operator that stands on the left side of (4.5). It also follows from
this that the orthogonality conditions (4.7) are automatically met
(that is, these conditions are a corollary of the equation). As for
the parameters $E_p$ in (3.20), these are the eigenvalues of the same
operator and can be interpreted as the energy levels of the
inner-shell electrons.

Let us introduce a linear integro-differential operator
$\mathcal{A}$, defining it as

$$\mathcal{A}\psi(r) = e^2 \int \frac{\rho(r', r)\, \psi(r')}{|r - r'|}\, d\tau' \tag{4.8}$$

Recalling formula (3.10) for the Hamiltonian, $H(r)$, we can write
our basic equation (4.5) as

$$-\frac{\hbar^2}{2m} \nabla^2 \psi + [U(r) + V(r)]\,\psi - \mathcal{A}\psi = E\psi \tag{4.9}$$

The operator $-\mathcal{A}$ enters as a separate term into the
expression for the energy of the electron. It can therefore be
interpreted as a special kind of energy, usually called the quantum
exchange energy. If we examine the way that we derived Eq. (3.13),
we see that the term $-\mathcal{A}\psi$ appears because we took into account
the symmetry properties of the wave function. These properties as
well as the Pauli exclusion principle are connected with the
indistinguishability of electrons and the impossibility of following
any electron when it interacts with other electrons (the
impossibility of labeling an electron so as to spot it after the
interaction). This is an impossibility characteristic of quantum
mechanics but not of classical mechanics (where the concept of a
path is considered universally applicable). For this reason it is
difficult to find a pictorial interpretation and a name for
$\mathcal{A}$. The accepted term “quantum exchange” comes from the idea
that when electrons interact, they change places, so to say.


5. The self-consistent field method 
in the theory of atoms 

The simplest of the many-electron systems is the atom. What
makes it easier to apply the self-consistent field method in
describing various atoms is that even before quantum mechanics was
fully established as we know it today (Schrödinger’s theory) Bohr
depicted the structure of electron shells for all atoms in the
Periodic Table. His approach makes it possible to assign certain
quantum numbers to each electron in the atom. These numbers
are similar to those that characterize the state of a separate
electron in a central field in the one-body problem. Bohr based his
approach on experimental data, namely, the analysis of the spectra
and chemical properties of atoms. Bohr’s picture was confirmed
theoretically by Schrödinger’s quantum mechanics and the
development of approximate methods, notably the self-consistent
field method.

In quantum mechanics the possibility of ascribing specific
quantum numbers to each electron in the atom implies the
possibility of ascribing a specific wave function to each one. This
is the very assumption that lies at the basis of the self-consistent
field method, where the overall wave function is expressed in terms
of one-electron wave functions. Thus we can determine the general
character of the wave function if we know the quantum numbers
of a given electron.

According to Bohr’s approach the electrons in the atom separate
into groups of equivalent electrons. Each such group is
characterized by two quantum numbers: the principal quantum
number $n$ and the azimuthal quantum number $l$, where $n = 1, 2, 3, \ldots$
and $l = 0, 1, \ldots, n - 1$. Inside each group the electrons can be
characterized by two other quantum numbers: the magnetic
quantum number $m$ and the spin quantum number $m_s$, where
$m = -l, -l + 1, \ldots, l - 1, l$ and $m_s = \pm 1/2$. In the theory
elaborated in Section 3 we took into account two values of the
spin quantum number by separating the electrons into two clouds.
Hence to characterize the wave function of an electron with given
$n$ and $l$ it is sufficient to specify the value of the magnetic
quantum number $m$.

If we denote the spherical coordinates with the origin in the
nucleus by $r, \theta, \varphi$, then the wave function of an electron with
quantum numbers $n, l, m$ will be

$$\psi_{nlm}(r, \theta, \varphi) = \frac{1}{\sqrt{4\pi}}\, R_{nl}(r)\, Y_{lm}(\theta, \varphi) \tag{5.1}$$

where $Y_{lm}$ is a normalized spherical harmonic:

$$\iint |Y_{lm}(\theta, \varphi)|^2 \sin\theta\, d\theta\, d\varphi = 4\pi \tag{5.2}$$

We must note that the choice of the quantum numbers that serve
to separate the electrons inside a group is somewhat arbitrary
because we can choose any direction as the polar axis. This
arbitrariness is not evident, however, when a group of equivalent
electrons is completely filled, that is, when the electrons in the
group have all possible values of $m$. We say then that we have a
closed electron shell.

Let us write the expression for the mixed charge density of
the electrons of one cloud that belong to a closed electron shell.
By (4.4) we have

$$\rho_{nl}(r, r') = \sum_{m=-l}^{l} \bar{\psi}_{nlm}(r, \theta, \varphi)\, \psi_{nlm}(r', \theta', \varphi') \tag{5.3}$$

Substituting from (5.1) and using the addition theorem for
spherical harmonics, we get

$$\rho_{nl}(r, r') = \frac{2l + 1}{4\pi}\, R_{nl}(r)\, R_{nl}(r')\, P_l(\cos\gamma) \tag{5.4}$$

where

$$\cos\gamma = \cos\theta \cos\theta' + \sin\theta \sin\theta' \cos(\varphi - \varphi') \tag{5.5}$$

and $P_l$ is a Legendre polynomial. Since $\gamma$ is the angle between the
directions $(\theta, \varphi)$ and $(\theta', \varphi')$, it does not depend on the choice of
the polar axis. Formula (5.5) prompts the conclusion that a closed
electron shell is spherically symmetric.
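The addition theorem used in passing from (5.3) to (5.4) can be checked directly for $l = 1$. The sketch below assumes the explicit spherical harmonics $Y_{10} = \sqrt{3}\cos\theta$ and $Y_{1,\pm 1} = \mp\sqrt{3/2}\,\sin\theta\, e^{\pm i\varphi}$, written by us in the $4\pi$ normalization of (5.2), and compares the sum over $m$ with $(2l+1)P_l(\cos\gamma) = 3\cos\gamma$.

```python
import cmath
import math


def y1(m, theta, phi):
    """Spherical harmonics with l = 1, normalized to 4*pi as in (5.2)."""
    if m == 0:
        return complex(math.sqrt(3.0) * math.cos(theta))
    sign = -1.0 if m == 1 else 1.0
    return sign * math.sqrt(1.5) * math.sin(theta) * cmath.exp(1j * m * phi)


def shell_sum(theta, phi, theta2, phi2):
    """Sum over m of conj(Y_1m) at the first direction times Y_1m at the second."""
    return sum(y1(m, theta, phi).conjugate() * y1(m, theta2, phi2)
               for m in (-1, 0, 1))


def cos_gamma(theta, phi, theta2, phi2):
    """Cosine of the angle between the two directions, formula (5.5)."""
    return (math.cos(theta) * math.cos(theta2)
            + math.sin(theta) * math.sin(theta2) * math.cos(phi - phi2))


angles = (0.4, 1.3, 2.1, -0.8)
print(shell_sum(*angles), 3.0 * cos_gamma(*angles))
```

The sum depends on the two directions only through the angle between them, which is what makes the closed shell independent of the choice of polar axis.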

The number of electrons of one cloud that belong to a closed
shell is $2l + 1$. Hence the total number of electrons of such a
shell is $4l + 2$.
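The counting can be made explicit by enumerating the quantum numbers. This is a small illustrative sketch; the helper names are ours.

```python
def shell_states(l):
    """All (m, m_s) pairs of a closed shell with azimuthal quantum number l."""
    return [(m, m_s) for m in range(-l, l + 1) for m_s in (+0.5, -0.5)]


def one_cloud_orbitals(shells):
    """Orbitals (n, l, m) of one cloud for a list of closed shells (n, l)."""
    return [(n, l, m) for n, l in shells for m in range(-l, l + 1)]


# Closed shells with (n, l) = (1, 0), (2, 0), (2, 1)
core = [(1, 0), (2, 0), (2, 1)]
print(len(one_cloud_orbitals(core)), sum(len(shell_states(l)) for _, l in core))
```

For these three closed shells each cloud holds five orbitals, and the shells together hold ten electrons.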

According to Bohr’s picture most of the electrons in an atom
make up closed shells. The rest, called the outer electrons, are
situated in unfilled shells. All elements have only one unfilled
shell, with the exception of the rare earths and a few other
elements, which have two.

The simplest case is when there is only one (valence) electron
outside the closed shells. An example of such an atom is the
sodium atom.

When studying problems connected with the atom, it is
convenient to use a “natural” system of units (atomic units)
introduced by Hartree [see Section 2, Chapter V, Part II]. In this
system the electron charge, the electron mass, and Planck’s
constant $h$ divided by $2\pi$ are put equal to unity. The atomic unit of
length is then equal to 0.529 angstrom, the atomic unit of speed
is the 137th part of the speed of light, and the atomic unit of
energy is twice the magnitude of the ground-state energy of the
hydrogen atom, that is, 27.21 eV (1 hartree).
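These three values follow from the fundamental constants. The sketch below assumes SI values of the constants (CODATA 2018, entered by hand, not taken from this text) and converts the Gaussian $e^2$ of the text to $e^2/4\pi\varepsilon_0$.

```python
import math

# SI constants (CODATA 2018 values, assumed here)
HBAR = 1.054571817e-34               # J s
ELECTRON_MASS = 9.1093837015e-31     # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
VACUUM_PERMITTIVITY = 8.8541878128e-12  # F/m
SPEED_OF_LIGHT = 299792458.0         # m/s


def atomic_units():
    """Return the atomic units of length (m), energy (J), and speed (m/s)."""
    # e^2 of the Gaussian system corresponds to e^2 / (4 pi eps0) in SI
    coulomb = ELEMENTARY_CHARGE ** 2 / (4.0 * math.pi * VACUUM_PERMITTIVITY)
    length = HBAR ** 2 / (ELECTRON_MASS * coulomb)
    energy = coulomb / length
    speed = coulomb / HBAR
    return length, energy, speed


length, energy, speed = atomic_units()
print(length * 1e10, "angstrom")
print(energy / ELEMENTARY_CHARGE, "eV")
print(SPEED_OF_LIGHT / speed, "= c in atomic units of speed")
```

The printed values can be compared with the figures quoted in the paragraph above.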

The Hamiltonian for sodium in atomic units is

$$H(r) = -\frac{1}{2} \nabla^2 - \frac{11}{r} \tag{5.6}$$

since the charge of the nucleus of sodium is equal to 11 units.
The inner-shell electrons of a sodium atom are grouped in three
closed shells. We denote the corresponding wave functions as

$$\psi_1 = \psi_{100}, \quad \psi_2 = \psi_{200}, \quad \psi_3 = \psi_{21,-1}, \quad \psi_4 = \psi_{210}, \quad \psi_5 = \psi_{211} \tag{5.7}$$

where the $\psi_{nlm}$ are given by (5.1). The mixed charge density
of all three shells will be

$$\rho(r, r') = \frac{1}{4\pi} \left[ R_{10}(r) R_{10}(r') + R_{20}(r) R_{20}(r') + 3 R_{21}(r) R_{21}(r') \cos\gamma \right] \tag{5.8}$$

Substituting this into the expression for $W_0$, (4.2), we get the
energy of an ionized sodium atom. In the expression obtained
for $W_0$ we can integrate over all angles. The expression will then
depend only on the radial functions $R_{10}$, $R_{20}$, and $R_{21}$.

Now we can proceed in two ways, numerically or analytically.
To apply the numerical method we must build equations involving
variations of the radial functions. These equations have a form
similar to (3.20). The numerical solution of variational equations
is based on the method of successive approximations. We can then
compile tables of radial functions with any desired degree of
precision. To apply this method successfully it is important to
choose a “reasonable” initial trial function. This is best done
analytically. For this we look for $R_{10}$, $R_{20}$, and $R_{21}$ in the form of
analytic functions that depend on a small number of variational
parameters. For instance, we can put

$$R_{10} = a e^{-\alpha r}, \qquad R_{20} = b \left[ 1 - \tfrac{1}{3} (\alpha + \beta) r \right] e^{-\beta r}, \qquad R_{21} = c\, r e^{-\gamma r} \tag{5.9}$$

where $a$, $b$, and $c$ are found from the normalization condition

$$\int_0^\infty [R_{nl}(r)]^2\, r^2\, dr = 1 \tag{5.10}$$

and are expressed in terms of $\alpha$, $\beta$, and $\gamma$. If we substitute (5.9)
into the expression for $W_0$, we find $W_0$ as a linear-fractional
function of $\alpha$, $\beta$, and $\gamma$. By equating with zero the derivatives of $W_0$
with respect to parameters $\alpha$, $\beta$, $\gamma$ we arrive at equations for
$\alpha$, $\beta$, $\gamma$. After solving these we get

$$\alpha = 10.68, \qquad \beta = 4.22, \qquad \gamma = 3.49 \tag{5.11}$$

with the following radial functions (in analytic form):

$$\begin{aligned}
R_{10}(r) &= 69.804\, e^{-10.68 r} \\
R_{20}(r) &= 13.602\, (1 - 4.967 r)\, e^{-4.22 r} \\
R_{21}(r) &= 26.276\, r e^{-3.49 r}
\end{aligned} \tag{5.12}$$
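The coefficients in (5.12) can be recomputed from the parameters (5.11) and the normalization condition (5.10). The sketch below assumes the trial forms (5.9) and uses the elementary integral $\int_0^\infty r^n e^{-pr}\, dr = n!/p^{n+1}$; the function names are ours.

```python
import math


def radial_moment(power, rate):
    """Integral of r**power * exp(-rate * r) over r from 0 to infinity."""
    return math.factorial(power) / rate ** (power + 1)


ALPHA, BETA, GAMMA = 10.68, 4.22, 3.49  # variational parameters (5.11)
NODE = (ALPHA + BETA) / 3.0             # coefficient of r in R_20, see (5.9)


def normalization_constants():
    """Constants a, b, c of (5.9) fixed by the normalization condition (5.10)."""
    a = 1.0 / math.sqrt(radial_moment(2, 2.0 * ALPHA))
    norm_20 = (radial_moment(2, 2.0 * BETA)
               - 2.0 * NODE * radial_moment(3, 2.0 * BETA)
               + NODE ** 2 * radial_moment(4, 2.0 * BETA))
    b = 1.0 / math.sqrt(norm_20)
    c = 1.0 / math.sqrt(radial_moment(4, 2.0 * GAMMA))
    return a, b, c


def overlap_10_20():
    """Integral of R_10 R_20 r^2 dr, up to the constant factor a * b."""
    rate = ALPHA + BETA
    return radial_moment(2, rate) - NODE * radial_moment(3, rate)


print(normalization_constants(), NODE, overlap_10_20())
```

The vanishing overlap shows why the factor $\tfrac{1}{3}(\alpha + \beta)$ appears in $R_{20}$: it makes $R_{20}$ orthogonal to $R_{10}$ for any $\alpha$ and $\beta$.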

If we take these as the initial trial functions, we can get still
better approximations from the variational equations. Functions
(5.12) correspond to a definite value of the energy of an ionized
sodium atom, namely, to $W_0 = -160.9$ hartrees, whereas the more
exact functions, found by numerical solution of the equations,
yield $W_0 = -161.8$ hartrees.
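The same procedure can be followed by hand on a smaller system. The sketch below is an illustration that is not part of the text: it assumes the helium atom (two electrons on one orbit, $n = 2$, $k = 1$) with the one-parameter trial function $e^{-\alpha r}$, for which the standard textbook result in atomic units is $W(\alpha) = \alpha^2 - 2Z\alpha + \tfrac{5}{8}\alpha$. The minimum is found numerically and compared with the accepted nonrelativistic ground-state energy of about $-2.9037$ hartrees.

```python
import math


def helium_energy(alpha, nuclear_charge=2.0):
    """Energy (hartrees) of two electrons in the trial orbital exp(-alpha * r)."""
    kinetic = alpha ** 2
    attraction = -2.0 * nuclear_charge * alpha
    repulsion = 0.625 * alpha
    return kinetic + attraction + repulsion


def golden_section_minimum(f, lo, hi, tol=1e-10):
    """Locate the minimum of a unimodal function on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


best_alpha = golden_section_minimum(helium_energy, 0.5, 3.0)
best_energy = helium_energy(best_alpha)
print(best_alpha, best_energy)
```

As with the sodium core, the analytic trial function gives an energy that lies somewhat above the more exact value, and the optimized exponent is a reasonable starting point for successive approximations.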

By finding the functions $R_{10}$, $R_{20}$, and $R_{21}$ we have, in the given
approximation, fully determined the electron-shell structure for
sodium. We can now build the integro-differential equation (4.5)
for the wave function of the valence electron. This function will
be of type (5.1), where $R_{nl}(r)$ is found by solving numerically a
certain equation, which we will not elaborate on here.

Solution of Eq. (4.5) first of all gives the optical terms of
sodium. It also gives a rough estimate of the X-ray terms. To
exhibit the accuracy of our method we offer the following table:

                E_10      E_20      E_21      E_30      E_31
  Theory       -40.6       …         …         …       -0.1094
  Experiment   -39.4       …         …         …       -0.1115

All figures are given in hartrees. We see that the optical terms
($E_{30}$ and $E_{31}$) are found with a fair degree of accuracy. For
instance, the error in $E_{31}$ amounts to only 1.9 percent. For
comparison, if in Eq. (4.5) we neglect the term with the integral, the
value of $E_{31}$ is -0.08895, which corresponds to an error of
22.4 percent.

Apart from the optical terms our integro-differential equation
makes it possible to find the wave functions for the valence
electron on different energy levels. This enables us in turn to find
the transition probabilities. For sodium and lithium the result
agrees qualitatively with experiments. The result for lithium is
the characteristic nonmonotonic dependence of the probability of
the transition $E_{n1} \to E_{20}$ on quantum number $n$.

These results show that we can formulate the theory of the 
valence electron in an atom, with a high degree of accuracy, as 
the problem of a single body in a given field provided that we 
take into account the quantum exchange by introducing the 
integro-differential equation (4.5). 

One of the ways in which we can use our equation has to do 
with what is known as the sum rule. This rule refers to the oscil¬ 
lator strengths, which are quantities proportional (a) to the 
square of the matrix element of the given transition and (b) to 
the difference in the two energy levels involved in this transition. 
The rule is usually stated in this way: 

The sum of the oscillator strengths corresponding to all optical
transitions in a given series must be unity.

This is an exact rule for atoms with one electron (hydrogen¬ 
like atoms, or ions with no electrons except one). As for atoms 
with a single valence electron, experiment has shown that in 
some cases (lithium, thallium and other atoms) the very first 
“oscillators” yield a sum greater than unity. Our theory explains 
this by noting that Eq. (4.5) has a complete set of eigenfunctions, 
and some of these functions correspond to X-ray terms. Hence, 
when building the sum of the oscillator strengths, we must account 
for the fictitious oscillators that correspond to transitions to 
occupied X-ray terms, which lie below the optical terms. 

Since for these oscillators the difference in the energy levels 
that enters into the oscillator strength is negative, the oscillator 
strength itself will be negative. Thus in the expression for the 
total sum there will be a (finite) number of negative terms. It 
is clear from this that if the total sum is unity, the sum of the 
positive terms, which corresponds to the observed optical tran¬ 
sitions, will exceed unity. We must also bear in mind that the 
total sum is unity only if we neglect certain small corrections 
due to quantum exchange. 
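The mechanism described here (a total sum equal to unity, with negative contributions from transitions to lower levels, so that the positive part exceeds unity) can be illustrated on a model that is not taken from the text: the one-dimensional harmonic oscillator in units m = ħ = ω = 1. The sketch below assumes the standard definition f = 2 (E_k - E_n) |<k|x|n>|² for the oscillator strength; the function names are ours.

```python
from math import sqrt

def x_matrix_element(k, n):
    """<k|x|n> for a harmonic oscillator with m = hbar = omega = 1."""
    if k == n + 1:
        return sqrt((n + 1) / 2)
    if k == n - 1:
        return sqrt(n / 2)
    return 0.0

def oscillator_strength(k, n):
    """f = 2 (E_k - E_n) |<k|x|n>|^2 with E_j = j + 1/2."""
    return 2.0 * (k - n) * x_matrix_element(k, n) ** 2

n = 3  # initial level; the levels 0, 1, 2 lie below it
strengths = {k: oscillator_strength(k, n) for k in range(0, n + 3)}

positive_sum = sum(f for f in strengths.values() if f > 0)  # upward transitions
negative_sum = sum(f for f in strengths.values() if f < 0)  # downward transitions
total = positive_sum + negative_sum

print(strengths, positive_sum, negative_sum, total)
```

In this toy model the downward transition contributes a negative strength, the total remains unity, and the sum over the upward transitions alone is therefore greater than unity. This is the same bookkeeping as for the fictitious oscillators that correspond to the occupied X-ray terms.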

Our equations in relation to atoms with a single valence
electron are mainly applied in calculating energy levels and
transition intensities. But attempts have also been made to use these
equations to account for some relativistic effects, namely, the
distance between the two terms of a doublet (doublet splitting).

Because of the difficulties involved in determining the wave 
function of the valence electron over small distances from the 
nucleus these attempts failed to give good quantitative results. 
For the doublet splitting of the spectral terms of sodium and 
lithium they only made it possible to estimate the order of mag¬ 
nitude. (We note that earlier calculations involving rougher 
approximations of the wave function did not even give the correct 
order of magnitude.) But they did bring out how important a role 
is played in the formula for doublet splitting by the terms 
responsible for the quantum exchange. It well may be that these 
terms explain the observed negative value of the doublet splitting 
for some atoms. In all likelihood, however, it will not be sufficient 
to consider the valence electron as being in the static field of the 
inner-shell electrons if we are to build an exact theory of doublet 
splitting. 

6. The symmetry of the Hamiltonian 
of a hydrogen-like atom 

In the previous section, following Bohr, we characterized each
atomic shell by two quantum numbers, n and l. For a given value
of n the number l takes on the values l = 0, 1, 2, ..., n - 1,
altogether n values. All n electron shells belonging to a given
value of n form the so-called “big” shell. (In fact, we often speak
of electron shells and subshells. There is one shell for each
principal quantum number n, and there is one subshell for
each value of l allowed by l = 0, 1, 2, ..., n - 1. Thus each shell
consists of a number of subshells.) This big shell possesses great
stability in the atom. In the monovalent atoms of lithium, sodium,
and copper there are, respectively, one, two, and three big shells
that are completely filled (closed). It is most convenient to describe
a big shell with the help of hydrogenic wave functions corresponding to
a certain effective charge of the nucleus. This charge can be
determined using the variational method.

Let us denote the true charge of the nucleus by Z and the
effective charge for a big electron shell with quantum number n
by $Z_n$. It is more convenient to deal with the quantity $p_n = Z_n/n$,
which is the root-mean-square momentum of an electron in
the nth big shell (in atomic units), rather than with the effective
charge $Z_n$.

Describing atomic shells with the help of (analytic) hydrogenic
functions makes it possible to find simple formulas for the various
functions that characterize the properties of an atom. For instance,
the momentum distribution function for the electrons in the nth
shell, normalized according to the condition

$$4\pi \int_0^\infty \rho_n(p)\, p^2\, dp = n^2 \tag{6.1}$$

turns out to be

$$\rho_n(p) = \frac{8\, p_n^5\, n^2}{\pi^2 \left(p_n^2 + p^2\right)^4} \tag{6.2}$$
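The normalization (6.1) of the distribution (6.2) can be checked numerically. The sketch below is only an illustration: it substitutes p = p_n tan θ to map the infinite range onto a finite one and applies a plain midpoint rule; the function names and the sample values of n and p_n are our own choices.

```python
from math import pi, sin, cos

def shell_momentum_density(p, n, p_n):
    """rho_n(p) as written in (6.2)."""
    return 8.0 * p_n ** 5 * n ** 2 / (pi ** 2 * (p_n ** 2 + p ** 2) ** 4)

def normalization_integral(n, p_n, steps=2000):
    """4*pi * integral of rho_n(p) p^2 dp over 0..infinity (midpoint rule in theta)."""
    h = (pi / 2) / steps
    total = 0.0
    for j in range(steps):
        theta = (j + 0.5) * h
        p = p_n * sin(theta) / cos(theta)      # p = p_n * tan(theta)
        dp_dtheta = p_n / cos(theta) ** 2
        total += shell_momentum_density(p, n, p_n) * p * p * dp_dtheta
    return 4.0 * pi * total * h

value_n1 = normalization_integral(1, 1.0)
value_n2 = normalization_integral(2, 1.5)
print(value_n1, value_n2)
```

For any positive p_n the integral should reproduce n², the number of orbital states in the nth big shell.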


The theory of hydrogenic functions can be applied not only to 
the theory of the electron-shell structure of atoms but also to the 
Compton scattering on bound electrons and other similar problems 
dealing with wave functions belonging to a continuous spectrum. 
For the hydrogen atom we have, in atomic units, 

$$n = (-2E)^{-1/2} \tag{6.3}$$

In the continuous spectrum E is positive and (6.3) is pure imag¬ 
inary. This, however, does not exclude the possibility of using 
the theory of hydrogenic functions belonging to a discrete 
spectrum for the case when the relationships obtained with their 
help are formulated in terms of analytic functions of n. We can 
then make a formal transition in these relationships from the 
discrete spectrum to the continuous, giving n pure imaginary 
values. For this reason we will confine ourselves to a discrete 
spectrum. 

To write the Schrodinger equation in momentum space for
a hydrogen-like atom we must first determine what corresponds to
the operator of multiplication by 1/r in momentum space. The
wave functions in coordinate space and momentum space are
connected by the relation

$$\psi(p_x, p_y, p_z) = \frac{1}{(2\pi\hbar)^{3/2}} \int e^{-i(x p_x + y p_y + z p_z)/\hbar}\, \psi(x, y, z)\, dx\, dy\, dz \tag{6.4}$$

We must find the form of the operator L that transforms the
function ψ(p) into the function Lψ(p), which can be represented as

$$L\psi(p_x, p_y, p_z) = \frac{1}{(2\pi\hbar)^{3/2}} \int e^{-i(x p_x + y p_y + z p_z)/\hbar}\, \frac{\psi(x, y, z)}{r}\, dx\, dy\, dz \tag{6.5}$$

But

$$\frac{1}{2\pi^2\hbar} \int \frac{e^{-i(x p'_x + y p'_y + z p'_z)/\hbar}}{|\mathbf{p} - \mathbf{p}'|^2}\, d\mathbf{p}' = \frac{1}{r}\, e^{-i(x p_x + y p_y + z p_z)/\hbar} \tag{6.6}$$

where $d\mathbf{p}' = dp'_x\, dp'_y\, dp'_z$ is the volume element in momentum
space. Hence the operator that in coordinate space appears as
multiplication by 1/r transforms in momentum space into the
integral operator

$$L\psi(\mathbf{p}) = \frac{1}{2\pi^2\hbar} \int \frac{\psi(\mathbf{p}')}{|\mathbf{p} - \mathbf{p}'|^2}\, d\mathbf{p}' \tag{6.7}$$

Thus the Schrodinger equation in a Coulomb field with potential
energy $-Ze^2/r$ becomes in momentum space an integral
equation of type

$$\frac{p^2}{2m}\,\psi(\mathbf{p}) - \frac{Ze^2}{2\pi^2\hbar} \int \frac{\psi(\mathbf{p}')}{|\mathbf{p} - \mathbf{p}'|^2}\, d\mathbf{p}' = E\,\psi(\mathbf{p}) \tag{6.8}$$

Since we are dealing with the discrete spectrum, for which energy 
E is negative, we can introduce the root-mean-square momentum 

$$p_0 = (-2mE)^{1/2} \tag{6.9}$$

We will consider the components of momentum divided by $p_0$
as the rectangular coordinates on a hyperplane corresponding to
the stereographic projection of a sphere of unit radius in
four-dimensional Euclidean space. The rectangular coordinates of a
point on the sphere are

$$\xi = \frac{2 p_0 p_x}{p_0^2 + p^2} = \sin\alpha \sin\theta \cos\varphi$$

$$\eta = \frac{2 p_0 p_y}{p_0^2 + p^2} = \sin\alpha \sin\theta \sin\varphi$$

$$\zeta = \frac{2 p_0 p_z}{p_0^2 + p^2} = \sin\alpha \cos\theta$$

$$\chi = \frac{p_0^2 - p^2}{p_0^2 + p^2} = \cos\alpha \tag{6.10}$$

where

$$\xi^2 + \eta^2 + \zeta^2 + \chi^2 = 1 \tag{6.11}$$
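A small numerical illustration of the projection (6.10) and the identity (6.11) follows. It is only a sketch: the function names and the sample momenta are ours, and p_0 is treated simply as a positive number.

```python
def to_hypersphere(px, py, pz, p0):
    """Map a momentum vector to (xi, eta, zeta, chi) according to (6.10)."""
    p_squared = px * px + py * py + pz * pz
    denom = p0 * p0 + p_squared
    return (
        2 * p0 * px / denom,
        2 * p0 * py / denom,
        2 * p0 * pz / denom,
        (p0 * p0 - p_squared) / denom,
    )

def squared_radius(point):
    """xi^2 + eta^2 + zeta^2 + chi^2, which (6.11) says equals 1."""
    return sum(c * c for c in point)

sample = to_hypersphere(0.7, -1.3, 2.1, p0=1.5)
pole = to_hypersphere(0.0, 0.0, 0.0, p0=1.5)      # p = 0 is mapped to chi = 1
equator = to_hypersphere(1.5, 0.0, 0.0, p0=1.5)   # |p| = p0 is mapped to chi = 0

print(sample, squared_radius(sample), pole, equator)
```

Momenta with |p| < p_0 land on the half of the hypersphere with χ > 0, and momenta with |p| > p_0 on the half with χ < 0.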

The angles α, θ, and φ are the spherical coordinates on the
hypersphere. At the same time the angles θ and φ are the angles
that characterize the direction of momentum in three-dimensional
space. The element of area on the hypersphere is

$$d\Omega = \sin^2\alpha\, d\alpha\, \sin\theta\, d\theta\, d\varphi \tag{6.12}$$

and the total surface area of the hypersphere is $2\pi^2$. Instead of
ψ(p) we introduce the function

$$\Psi(\alpha, \theta, \varphi) = \frac{\pi}{\sqrt{8}}\, p_0^{-5/2} \left(p_0^2 + p^2\right)^2 \psi(\mathbf{p}) \tag{6.13}$$

for which the normalization condition is

$$\frac{1}{2\pi^2} \int |\Psi(\alpha, \theta, \varphi)|^2\, d\Omega = \int \frac{p_0^2 + p^2}{2 p_0^2}\, |\psi(\mathbf{p})|^2\, d\mathbf{p} = \int |\psi(\mathbf{p})|^2\, d\mathbf{p} = 1 \tag{6.14}$$

If for the sake of brevity we put

$$\lambda = \frac{Z m e^2}{\hbar p_0} = \frac{Z m e^2}{\hbar\, (-2mE)^{1/2}} \tag{6.15}$$

and go on to the new variables, the Schrodinger equation (6.8)
becomes

$$\Psi(\alpha, \theta, \varphi) = \frac{\lambda}{2\pi^2} \int \frac{\Psi(\alpha', \theta', \varphi')}{4 \sin^2(\omega/2)}\, d\Omega' \tag{6.16}$$


Here 2 sin(ω/2) is the length of the chord and ω is the arc
length of the great circle that connects the points α, θ, φ and
α', θ', φ' on the four-dimensional sphere, so that

$$4 \sin^2\frac{\omega}{2} = (\xi - \xi')^2 + (\eta - \eta')^2 + (\zeta - \zeta')^2 + (\chi - \chi')^2 \tag{6.17}$$

Equation (6.16) is the integral equation for the spherical
harmonics of a four-dimensional sphere. The eigenvalues will be the
integers λ = n (n = 1, 2, ...), and the eigenfunctions will be
the homogeneous harmonic functions of degree n - 1 in the
arguments ξ, η, ζ, χ, that is, functions of type

$$\Psi = u(\xi, \eta, \zeta, \chi) \tag{6.18}$$

where $u(x_1, x_2, x_3, x_4)$ is a solution of the four-dimensional
Laplace equation

$$\frac{\partial^2 u}{\partial x_1^2} + \frac{\partial^2 u}{\partial x_2^2} + \frac{\partial^2 u}{\partial x_3^2} + \frac{\partial^2 u}{\partial x_4^2} = 0 \tag{6.19}$$


As we see from (6.15) the integer n is the principal quantum 
number. 
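To make this remark explicit, we can set λ = n in (6.15) and solve for the energy. The short derivation below uses only the definitions (6.9) and (6.15); it is added here as a worked step and is not part of the original text.

```latex
\lambda = \frac{Z m e^2}{\hbar p_0} = n
\quad\Longrightarrow\quad
p_0 = \frac{Z m e^2}{\hbar n},
\qquad
E = -\frac{p_0^2}{2m} = -\frac{Z^2 m e^4}{2 \hbar^2 n^2},
\qquad n = 1, 2, \ldots
```

In atomic units and for Z = 1 this is E = -1/(2n²), which is relation (6.3) solved for E.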

Thus the theory of the hydrogen atom is connected with the 
four-dimensional potential theory. This interrelation makes it 
possible to easily derive the properties of hydrogenic wave func¬ 
tions and, notably, establish the addition theorem for these func¬ 
tions. This theorem holds not only for real integral values of n 
(the discrete spectrum) but also for complex values (the con¬ 
tinuous spectrum). 

The most significant corollary of this interrelation is the
determination of the symmetry transformation group allowed by the
Schrodinger equation for the hydrogen atom. Equation (6.17)
obviously retains its form under an orthogonal transformation of
the variables ξ, η, ζ, χ, that is, if the hypersphere is rotated
arbitrarily in four-dimensional space. It follows from this that the
initial Schrodinger equation not only possesses the usual
spherical symmetry but also has a wider symmetry corresponding to
four-dimensional rotations. This explains the long-known fact that
the energy levels of hydrogen depend only on the principal
quantum number n. Application of a wider transformation group to
the Schrodinger equation ensures the results mentioned at the
beginning of this section. We will not detail these results or their
derivation.



Part V 


DIRAC'S THEORY OF THE ELECTRON


Chapter I 


THE DIRAC EQUATION 

1. Quantum mechanics and the theory of relativity 

The theories of both Schrodinger and Pauli are nonrelativistic. 
They ignore the fact that no mass point can move in space and no 
action can propagate in space at a velocity that exceeds the 
speed of light. A relativistic generalization of quantum mechanics 
requires introducing new physical concepts and even modifying 
the interpretation of the wave equation. The modification is nec¬ 
essary because we need to introduce, besides spin, a new degree 
of freedom for the electron and because we cannot interpret this 
degree of freedom within the limits of the one-body problem. 

However, it is possible to formulate the problem of one body 
(the electron) in a given external electromagnetic field in accord¬ 
ance with the theory of relativity. Dirac formulated this problem 
when he suggested his equation for the electron. 

In Section 13, Chapter III, Part I, we saw that the wave
equation, that is, the equation that determines the law governing
the time dependence of the electronic state (function ψ), must be

$$H\psi - i\hbar \frac{\partial \psi}{\partial t} = 0 \tag{1.1}$$

where H is the Hamiltonian. The wave equation is closely related
to the quantum equations of motion, from which we derived the
wave equation (Section 13, Chapter III, Part I). The quantum
equations in turn can be derived from the wave equation
(Section 4, Chapter IV, Part I). Now, following Dirac, we must
generalize the wave equation (1.1) to the theory of relativity. We
must require that it be invariant under Lorentz transformations
and that it give us the classical equations of motion of the theory
of relativity.

2. Classical equations of motion 

Let us recall the form of the classical equations of motion of
the theory of relativity and the corresponding Lagrangian and
Hamiltonian functions.

In the mechanics of the theory of relativity the momentum
$(P_x, P_y, P_z)$ is connected with the velocity $(\dot{x}, \dot{y}, \dot{z})$ by the
relationships

$$P_x = \frac{m\dot{x}}{(1 - v^2/c^2)^{1/2}}, \quad P_y = \frac{m\dot{y}}{(1 - v^2/c^2)^{1/2}}, \quad P_z = \frac{m\dot{z}}{(1 - v^2/c^2)^{1/2}} \tag{2.1}$$

where

$$v^2 = \dot{x}^2 + \dot{y}^2 + \dot{z}^2 \tag{2.1*}$$

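The equations of motion of an electron (mass m and electric
charge -e) in an electromagnetic field have the form

$$\frac{dP_x}{dt} = -\frac{e}{c}\left(\dot{y}\,\mathcal{H}_z - \dot{z}\,\mathcal{H}_y\right) - e\,\mathcal{E}_x$$

$$\frac{dP_y}{dt} = -\frac{e}{c}\left(\dot{z}\,\mathcal{H}_x - \dot{x}\,\mathcal{H}_z\right) - e\,\mathcal{E}_y$$

$$\frac{dP_z}{dt} = -\frac{e}{c}\left(\dot{x}\,\mathcal{H}_y - \dot{y}\,\mathcal{H}_x\right) - e\,\mathcal{E}_z \tag{2.2}$$

From these we can easily derive the equation

$$\frac{dT}{dt} = -e\left(\dot{x}\,\mathcal{E}_x + \dot{y}\,\mathcal{E}_y + \dot{z}\,\mathcal{E}_z\right) \tag{2.3}$$

where T is the electron’s kinetic energy:

$$T = \frac{mc^2}{(1 - v^2/c^2)^{1/2}} \tag{2.4}$$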


These equations can be obtained from the Lagrangian

$$\mathcal{L} = -mc^2 \left(1 - v^2/c^2\right)^{1/2} - \frac{e}{c}\left(\dot{x} A_x + \dot{y} A_y + \dot{z} A_z\right) + e\Phi \tag{2.5}$$

where Φ is the scalar potential, and $\mathbf{A} = (A_x, A_y, A_z)$ the vector
potential. The generalized momentum conjugate to coordinate x is

$$p_x = \frac{\partial \mathcal{L}}{\partial \dot{x}} = \frac{m\dot{x}}{(1 - v^2/c^2)^{1/2}} - \frac{e}{c} A_x \tag{2.6}$$

and similarly for the other coordinates. Hence the generalized
momenta $p_x, p_y, p_z$ do not coincide with the momentum components
$P_x, P_y, P_z$ but are linked with them, as in the nonrelativistic case, by
the relationships

$$P_x = p_x + \frac{e}{c} A_x, \quad P_y = p_y + \frac{e}{c} A_y, \quad P_z = p_z + \frac{e}{c} A_z \tag{2.7}$$
[see formula (5.16), Part III]. The energy of the electron is

$$W = T - e\Phi \tag{2.8}$$

Expressing it in terms of the generalized momenta, we find
the classical Hamiltonian function 

$$H_{\text{classical}} = mc^2 \left[1 + \frac{1}{m^2 c^2}\left(\mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2\right]^{1/2} - e\Phi \tag{2.9}$$

3. Derivation of the wave equation 

We seek a quantum operator that corresponds to the Hamil¬ 
tonian function (2.9). Let us start from the simplest case of a 
free electron, that is, when there is no electromagnetic field and 
the scalar and vector potentials are zero. Then 

$$H_{\text{classical}} = mc^2 \left[1 + \frac{1}{m^2 c^2}\left(p_x^2 + p_y^2 + p_z^2\right)\right]^{1/2} \tag{3.1}$$

Because equations of the theory of relativity are symmetric
with respect to coordinates and time and because the wave
equation contains the linear operator of differentiation with respect to
time, the equation must also have linear operators of
differentiation with respect to coordinates. Consequently, the quantum
Hamiltonian must be linear with respect to the operators

$$p_x = -i\hbar \frac{\partial}{\partial x}, \quad p_y = -i\hbar \frac{\partial}{\partial y}, \quad p_z = -i\hbar \frac{\partial}{\partial z}$$

that is, it must have the form

$$H = \beta_1 p_x + \beta_2 p_y + \beta_3 p_z + \beta_4 \tag{3.2}$$

where $\beta_k$ are the as yet unknown operators that do not depend
on $p_x, p_y, p_z$. But these operators must not contain the
coordinates x, y, z either, because all points in space have equal status
for a free electron. Consequently, they must act on some new
variables on which the wave function in Schrodinger’s theory did not
depend. We will determine the meaning of these new variables
later. We will see that they are a generalization of the operators
in Pauli’s theory.

To determine the properties of operators $\beta_k$ we require that
there be the same relationship between the square of the energy
and the square of momentum of a free electron in quantum
mechanics as in classical mechanics, namely

$$H^2 = m^2 c^4 + c^2 \left(p_x^2 + p_y^2 + p_z^2\right) \tag{3.3}$$

Let us calculate the square of operator (3.2), bearing in mind
that the $\beta_k$ do not contain the coordinates and hence commute
with $p_x, p_y, p_z$ but cannot commute with each other. We get

$$H^2 = \beta_1^2 p_x^2 + \beta_2^2 p_y^2 + \beta_3^2 p_z^2 + \beta_4^2$$
$$\quad + (\beta_1\beta_4 + \beta_4\beta_1)\, p_x + (\beta_2\beta_4 + \beta_4\beta_2)\, p_y + (\beta_3\beta_4 + \beta_4\beta_3)\, p_z$$
$$\quad + (\beta_2\beta_3 + \beta_3\beta_2)\, p_y p_z + (\beta_3\beta_1 + \beta_1\beta_3)\, p_z p_x + (\beta_1\beta_2 + \beta_2\beta_1)\, p_x p_y \tag{3.4}$$

This expression coincides with the previous one if the following
conditions are observed:

$$\beta_1^2 = \beta_2^2 = \beta_3^2 = c^2, \quad \beta_4^2 = m^2 c^4, \quad \beta_i\beta_k + \beta_k\beta_i = 0, \quad i \neq k \tag{3.5}$$

If we use the relationships

$$\beta_1 = c\alpha_1, \quad \beta_2 = c\alpha_2, \quad \beta_3 = c\alpha_3, \quad \beta_4 = mc^2\alpha_4 \tag{3.6}$$

to introduce new operators $\alpha_k$ proportional to $\beta_k$, then the
Hamiltonian will be

$$H = c\left(\alpha_1 p_x + \alpha_2 p_y + \alpha_3 p_z\right) + mc^2 \alpha_4 \tag{3.7}$$

and the $\alpha_k$ must satisfy the conditions

$$\alpha_i^2 = 1, \quad \alpha_i\alpha_k + \alpha_k\alpha_i = 0, \quad i \neq k \tag{3.8}$$

which in shorter form can be written as

$$\alpha_i\alpha_k + \alpha_k\alpha_i = 2\delta_{ik}, \quad i, k = 1, 2, 3, 4 \tag{3.9}$$
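The algebra (3.9) and its consequence (3.3) can be checked on concrete matrices. The sketch below is an illustration under an assumption: it uses one common 4×4 choice, built as Kronecker products of the 2×2 Pauli matrices, and treats the momentum components as ordinary numbers (which is legitimate for a free electron in a state of definite momentum). The helper names and the numerical values of m, c, and p are ours.

```python
def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b, scale_b=1):
    n = len(a)
    return [[a[i][j] + scale_b * b[i][j] for j in range(n)] for i in range(n)]

def scaled(a, factor):
    return [[factor * x for x in row] for row in a]

def max_abs_difference(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I4 = kron(I2, I2)

# Assumed representation: alpha_1..alpha_3 = S1 (x) S_k, alpha_4 = S3 (x) 1.
alphas = [kron(S1, S1), kron(S1, S2), kron(S1, S3), kron(S3, I2)]

# (3.9): alpha_i alpha_k + alpha_k alpha_i = 2 delta_ik
anticommutator_error = max(
    max_abs_difference(
        add(matmul(alphas[i], alphas[k]), matmul(alphas[k], alphas[i])),
        scaled(I4, 2 if i == k else 0),
    )
    for i in range(4) for k in range(4)
)

# (3.7) with numerical momentum components, then (3.3): H^2 = m^2 c^4 + c^2 p^2
m, c = 1.5, 2.0
p = (0.3, -1.2, 0.7)
H = scaled(alphas[3], m * c ** 2)
for p_k, alpha_k in zip(p, alphas):
    H = add(H, alpha_k, scale_b=c * p_k)
expected = m ** 2 * c ** 4 + c ** 2 * sum(p_k ** 2 for p_k in p)
h_squared_error = max_abs_difference(matmul(H, H), scaled(I4, expected))

print(anticommutator_error, h_squared_error)
```

Both errors should vanish to rounding accuracy, which shows that a linear Hamiltonian of type (3.7) reproduces the relativistic energy-momentum relation once (3.9) holds.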


4. The Dirac matrices 

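We will see later that the $\alpha_k$ can be considered as transforming
four functions $\psi_1, \psi_2, \psi_3, \psi_4$, in the same way that the Pauli
matrices transform two. The object on which the $\alpha_k$ act will then
be a collection of four functions, and the operators can be
represented in matrix form with elements that are the coefficients of
these transformations.

We will often denote the collection $\psi_1, \psi_2, \psi_3, \psi_4$ simply by ψ
and the transformation

$$\psi'_1 = \alpha_{11}\psi_1 + \alpha_{12}\psi_2 + \alpha_{13}\psi_3 + \alpha_{14}\psi_4$$
$$\psi'_2 = \alpha_{21}\psi_1 + \alpha_{22}\psi_2 + \alpha_{23}\psi_3 + \alpha_{24}\psi_4$$
$$\psi'_3 = \alpha_{31}\psi_1 + \alpha_{32}\psi_2 + \alpha_{33}\psi_3 + \alpha_{34}\psi_4$$
$$\psi'_4 = \alpha_{41}\psi_1 + \alpha_{42}\psi_2 + \alpha_{43}\psi_3 + \alpha_{44}\psi_4 \tag{4.1}$$

by the brief notation

$$\psi' = \alpha\psi \tag{4.2}$$

where, consequently, α is the matrix

$$\alpha = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \alpha_{13} & \alpha_{14} \\ \alpha_{21} & \alpha_{22} & \alpha_{23} & \alpha_{24} \\ \alpha_{31} & \alpha_{32} & \alpha_{33} & \alpha_{34} \\ \alpha_{41} & \alpha_{42} & \alpha_{43} & \alpha_{44} \end{pmatrix} \tag{4.3}$$

Let us express operators $\alpha_1, \alpha_2, \alpha_3, \alpha_4$, which satisfy (3.9), in
terms of matrices similar to those in Pauli’s theory (see Part III).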







From $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ we construct six matrices. First, three

$$\sigma_x = -i\alpha_2\alpha_3, \quad \sigma_y = -i\alpha_3\alpha_1, \quad \sigma_z = -i\alpha_1\alpha_2 \tag{4.4}$$

and then, three more:

$$\rho_a = -i\alpha_1\alpha_2\alpha_3, \quad \rho_b = \alpha_1\alpha_2\alpha_3\alpha_4, \quad \rho_c = \alpha_4 \tag{4.5}$$

We can easily verify that matrices $\sigma_x, \sigma_y, \sigma_z$ will satisfy the same
relationships as the Pauli matrices, namely

$$\sigma_y\sigma_z = -\sigma_z\sigma_y = i\sigma_x$$
$$\sigma_z\sigma_x = -\sigma_x\sigma_z = i\sigma_y$$
$$\sigma_x\sigma_y = -\sigma_y\sigma_x = i\sigma_z \tag{4.6}$$

The square of each of these will be unity:

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = 1 \tag{4.7}$$

The matrices $\rho_a, \rho_b, \rho_c$ will satisfy similar relationships:

$$\rho_b\rho_c = -\rho_c\rho_b = i\rho_a$$
$$\rho_c\rho_a = -\rho_a\rho_c = i\rho_b$$
$$\rho_a\rho_b = -\rho_b\rho_a = i\rho_c \tag{4.8}$$

and

$$\rho_a^2 = \rho_b^2 = \rho_c^2 = 1 \tag{4.9}$$

The products of matrices ρ and matrices σ are

$$\rho_a\sigma_x = \sigma_x\rho_a = \alpha_1$$
$$\rho_a\sigma_y = \sigma_y\rho_a = \alpha_2$$
$$\rho_a\sigma_z = \sigma_z\rho_a = \alpha_3 \tag{4.10}$$

further

$$\rho_b\sigma_x = \sigma_x\rho_b = i\alpha_1\alpha_4$$
$$\rho_b\sigma_y = \sigma_y\rho_b = i\alpha_2\alpha_4$$
$$\rho_b\sigma_z = \sigma_z\rho_b = i\alpha_3\alpha_4 \tag{4.11}$$

and, last

$$\rho_c\sigma_x = \sigma_x\rho_c = -i\alpha_2\alpha_3\alpha_4$$
$$\rho_c\sigma_y = \sigma_y\rho_c = -i\alpha_3\alpha_1\alpha_4$$
$$\rho_c\sigma_z = \sigma_z\rho_c = -i\alpha_1\alpha_2\alpha_4 \tag{4.12}$$

Hence each of the matrices ρ commutes with each of the
matrices σ, and we can say in a sense that ρ and σ refer to different
degrees of freedom of the electron.

Using matrices $\alpha_k$ expressed in terms of ρ and σ, we can write
the Hamiltonian (3.7) as

$$H = c\rho_a\left(\sigma_x p_x + \sigma_y p_y + \sigma_z p_z\right) + mc^2\rho_c \tag{4.13}$$

We note that to the four matrices $\alpha_1, \alpha_2, \alpha_3, \alpha_4$, which satisfy
(3.9), we could add a fifth, say, $\alpha_5 = \rho_b$. This last anticommutes
with the Hamiltonian (4.13).

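Now let us turn to the problem of building 4×4 matrices that
have the general properties we have just stated. We first consider
the three 2×2 matrices that we met in Pauli’s theory.

If we denote them by $\sigma_1^0, \sigma_2^0, \sigma_3^0$, we have

$$\sigma_1^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2^0 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{4.14}$$

Let us assume that the transformations

$$\sigma_1^0 \begin{pmatrix} \xi \\ \eta \end{pmatrix} = \begin{pmatrix} \eta \\ \xi \end{pmatrix}, \quad \sigma_2^0 \begin{pmatrix} \xi \\ \eta \end{pmatrix} = \begin{pmatrix} -i\eta \\ i\xi \end{pmatrix}, \quad \sigma_3^0 \begin{pmatrix} \xi \\ \eta \end{pmatrix} = \begin{pmatrix} \xi \\ -\eta \end{pmatrix} \tag{4.15}$$

are applied not to one but to two pairs of numbers, $(\xi, \eta)$ and $(\xi^*, \eta^*)$,
simultaneously. These two pairs can be considered to be a set of
four numbers $\psi_1, \psi_2, \psi_3, \psi_4$. We can relate the numbers $\xi, \eta, \xi^*, \eta^*$
to $\psi_1, \psi_2, \psi_3, \psi_4$ in different ways.

We can put, for instance

$$\psi_1 = \xi, \quad \psi_2 = \eta, \quad \psi_3 = \xi^*, \quad \psi_4 = \eta^* \tag{4.16}$$

or we can put

$$\psi_1 = \xi, \quad \psi_2 = \xi^*, \quad \psi_3 = \eta, \quad \psi_4 = \eta^* \tag{4.17}$$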
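In the first case the matrices corresponding to our
transformations will be

$$\sigma_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{4.18}$$

and in the second case

$$\rho_1 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad \rho_2 = \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & -i \\ i & 0 & 0 & 0 \\ 0 & i & 0 & 0 \end{pmatrix}, \quad \rho_3 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{4.19}$$

Obviously, the transformations $\sigma_1, \sigma_2, \sigma_3$ and the
transformations $\rho_1, \rho_2, \rho_3$, taken separately, satisfy the same relationships as
the transformations (4.15), which are applied to two functions.
Namely

$$\sigma_2\sigma_3 = -\sigma_3\sigma_2 = i\sigma_1$$
$$\sigma_3\sigma_1 = -\sigma_1\sigma_3 = i\sigma_2$$
$$\sigma_1\sigma_2 = -\sigma_2\sigma_1 = i\sigma_3$$
$$\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = 1 \tag{4.20}$$

and

$$\rho_2\rho_3 = -\rho_3\rho_2 = i\rho_1$$
$$\rho_3\rho_1 = -\rho_1\rho_3 = i\rho_2$$
$$\rho_1\rho_2 = -\rho_2\rho_1 = i\rho_3$$
$$\rho_1^2 = \rho_2^2 = \rho_3^2 = 1 \tag{4.21}$$

On the other hand, we can verify that each transformation σ
commutes with each transformation ρ, so that

$$\sigma_i\rho_k = \rho_k\sigma_i, \quad i, k = 1, 2, 3 \tag{4.22}$$

Each of the matrices ρ and σ and each of their products (4.22)
has two 2-fold degenerate eigenvalues, +1 and -1.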
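The relationships between the two triples of 4×4 matrices can be verified mechanically. The sketch below assumes that the matrices σ_k act inside each pair of components and the matrices ρ_k interchange the pairs, which corresponds to writing them as Kronecker products of the 2×2 Pauli matrices with the 2×2 unit matrix; the helper names are ours.

```python
def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def scaled(a, factor):
    return [[factor * x for x in row] for row in a]

def max_abs_difference(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

sigma = [kron(I2, s) for s in PAULI]   # act within the pairs (psi1, psi2) and (psi3, psi4)
rho = [kron(s, I2) for s in PAULI]     # interchange the two pairs

# Every sigma commutes with every rho.
commutation_error = max(
    max_abs_difference(matmul(s, r), matmul(r, s)) for s in sigma for r in rho
)

# Cyclic products: sigma_a sigma_b = i sigma_c, and the same for rho.
CYCLIC = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
product_error = max(
    max(max_abs_difference(matmul(t[a], t[b]), scaled(t[c], 1j)),
        max_abs_difference(matmul(t[b], t[a]), scaled(t[c], -1j)))
    for t in (sigma, rho) for a, b, c in CYCLIC
)

# Explicit form of rho_2 for comparison with the matrix written out in the text.
rho_2_explicit = [[0, 0, -1j, 0], [0, 0, 0, -1j], [1j, 0, 0, 0], [0, 1j, 0, 0]]
explicit_error = max_abs_difference(rho[1], rho_2_explicit)

print(commutation_error, product_error, explicit_error)
```

All three errors should be zero, confirming that each triple reproduces the Pauli algebra while the two triples commute with each other.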

The three matrices $\sigma_i$, the three $\rho_k$, and the nine $\sigma_i\rho_k$ together
with the unit matrix form a system of 16 matrices, which can be
called complete in the sense that any 4×4 matrix, that is, a
matrix with 16 elements, can be expressed as a linear combination
of these 16 matrices. The coefficients in this combination are
numbers.
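The completeness statement can be made concrete. Since each of the 16 matrices squares to unity and the trace of the product of two different ones is zero, the coefficient of a basis matrix Γ in the expansion of a matrix M is Tr(ΓM)/4. The sketch below expands an arbitrary sample matrix and rebuilds it; the Kronecker-product construction of the basis and all helper names are our assumptions for the purpose of illustration.

```python
def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
sigma = [kron(I2, s) for s in PAULI]
rho = [kron(s, I2) for s in PAULI]

# Unit matrix, three sigma, three rho, nine products: 16 matrices in all.
basis = [kron(I2, I2)] + sigma + rho + [matmul(s, r) for s in sigma for r in rho]

def expand(matrix):
    """Coefficients c_a = Tr(Gamma_a M) / 4 of M in the 16-matrix basis."""
    return [trace(matmul(g, matrix)) / 4 for g in basis]

def combine(coefficients):
    """Linear combination sum_a c_a Gamma_a."""
    total = [[0j] * 4 for _ in range(4)]
    for c, g in zip(coefficients, basis):
        for i in range(4):
            for j in range(4):
                total[i][j] += c * g[i][j]
    return total

sample = [
    [1 + 2j, 0.5, -3, 4j],
    [0, -1j, 2.5, 1],
    [7, 1 - 1j, 0, -2],
    [0.25j, 3, -1, 6],
]
coefficients = expand(sample)
rebuilt = combine(coefficients)
reconstruction_error = max(
    abs(sample[i][j] - rebuilt[i][j]) for i in range(4) for j in range(4)
)
print(len(basis), reconstruction_error)
```

The sample matrix is arbitrary; any other 4×4 matrix with numerical entries would be rebuilt equally well.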

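For example, we can use $\rho_k$ and $\sigma_i$ to express the matrices
involved in the Dirac equation and the related matrices, $\rho_a, \rho_b, \rho_c$
and $\sigma_x, \sigma_y, \sigma_z$. We can do this in various ways, so that matrices
with a given physical meaning can have different mathematical
forms. In the literature we most often find the representation
introduced by Dirac, who put

$$\sigma_x = \sigma_1, \quad \sigma_y = \sigma_2, \quad \sigma_z = \sigma_3, \quad \rho_a = \rho_1, \quad \rho_b = \rho_2, \quad \rho_c = \rho_3 \tag{4.23}$$

According to (4.10) and (4.15), the corresponding matrices $\alpha_k$
will be

$$\alpha'_1 = \rho_1\sigma_1, \quad \alpha'_2 = \rho_1\sigma_2, \quad \alpha'_3 = \rho_1\sigma_3, \quad \alpha'_4 = \rho_3 \tag{4.24}$$

(We have assigned a prime to each matrix so as to distinguish
these from the ones that we will use later.)

In some respects it is more convenient to use the following
matrices:

$$\sigma_x = \rho_3\sigma_1, \quad \sigma_y = \sigma_2, \quad \sigma_z = \rho_3\sigma_3$$
$$\rho_a = \rho_3, \quad \rho_b = \rho_1\sigma_2, \quad \rho_c = \rho_2\sigma_2 \tag{4.25}$$

or in explicit form

$$\sigma_x = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{4.26}$$

$$\rho_a = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \quad \rho_b = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix}, \quad \rho_c = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \tag{4.27}$$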


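From this we get the following expressions for $\alpha_k$:

$$\alpha_1 = \sigma_1, \quad \alpha_2 = \rho_3\sigma_2, \quad \alpha_3 = \sigma_3, \quad \alpha_4 = \rho_2\sigma_2 \tag{4.28}$$

or in explicit form

$$\alpha_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad \alpha_2 = \begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & i \\ 0 & 0 & -i & 0 \end{pmatrix}$$

$$\alpha_3 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \quad \alpha_4 = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \tag{4.29}$$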


5. The Dirac equation for a free electron 

We can now write the Dirac equation for a free electron in
explicit form. If H is the operator (3.7), the wave equation

$$H\psi = \left[c\left(\alpha_1 p_x + \alpha_2 p_y + \alpha_3 p_z\right) + mc^2\alpha_4\right]\psi = i\hbar \frac{\partial \psi}{\partial t} \tag{5.1}$$

can be written as a system of four differential equations:

$$-i\hbar c\left(\frac{\partial \psi_2}{\partial x} - i\frac{\partial \psi_2}{\partial y} + \frac{\partial \psi_1}{\partial z}\right) - mc^2\psi_4 = i\hbar \frac{\partial \psi_1}{\partial t}$$

$$-i\hbar c\left(\frac{\partial \psi_1}{\partial x} + i\frac{\partial \psi_1}{\partial y} - \frac{\partial \psi_2}{\partial z}\right) + mc^2\psi_3 = i\hbar \frac{\partial \psi_2}{\partial t}$$

$$-i\hbar c\left(\frac{\partial \psi_4}{\partial x} + i\frac{\partial \psi_4}{\partial y} + \frac{\partial \psi_3}{\partial z}\right) + mc^2\psi_2 = i\hbar \frac{\partial \psi_3}{\partial t}$$

$$-i\hbar c\left(\frac{\partial \psi_3}{\partial x} - i\frac{\partial \psi_3}{\partial y} - \frac{\partial \psi_4}{\partial z}\right) - mc^2\psi_1 = i\hbar \frac{\partial \psi_4}{\partial t} \tag{5.2}$$

We note that it is easier to investigate the Dirac equation when
it is written in the short form (5.1), so that there will be almost
no need to use (5.2).

We have considered two ways of choosing matrices. One,
proposed by Dirac, corresponds to formula (4.23). The other, proposed
in this book, corresponds to (4.25).

For some purposes it is convenient to introduce a
representation of the matrices $\alpha_k$ that ensures that the corresponding
system of equations for the four-component wave function of the
free electron has real coefficients. It is sufficient to interchange
$\rho_2$ and $\rho_3$ in (4.25) and change the sign before matrix $\rho_1$. Instead
of (4.25) we will then have

$$\sigma_x^0 = \rho_2\sigma_1, \quad \sigma_y^0 = \sigma_2, \quad \sigma_z^0 = \rho_2\sigma_3$$
$$\rho_a^0 = \rho_2, \quad \rho_b^0 = -\rho_1\sigma_2, \quad \rho_c^0 = \rho_3\sigma_2 \tag{5.3}$$

To distinguish the new matrices from the old we assigned them
the index 0. All matrices (5.3) have pure imaginary elements. 1

The new matrices and the old are connected via a canonical
transformation with matrix

$$T = \frac{1}{\sqrt{2}}\left(\rho_2 + \rho_3\right) \tag{5.4}$$

which is hermitian and unitary, so that

$$T^{-1} = T^{+} = T, \quad T^2 = 1 \tag{5.5}$$

Indeed, we have

$$T\rho_2 T = \rho_3, \quad T\rho_3 T = \rho_2, \quad T\rho_1 T = -\rho_1 \tag{5.6}$$

The new matrices $\alpha_k$ (which we denote by $\alpha_k^0$) will be connected
with the old matrices through the canonical transformation

$$\alpha_k^0 = T^{+}\alpha_k T \tag{5.7}$$

These will be

$$\alpha_1^0 = \sigma_1, \quad \alpha_2^0 = \rho_2\sigma_2, \quad \alpha_3^0 = \sigma_3, \quad \alpha_4^0 = \rho_3\sigma_2 \tag{5.8}$$

A comparison with (4.28) shows that they differ from the old in
that $\alpha_2$ is interchanged with $\alpha_4$. The elements of the first three
matrices will be real, and the elements of $\alpha_4^0$ pure imaginary.
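The properties of the matrix T and of the transformed matrices can be checked directly. The sketch below assumes the Kronecker-product form of the matrices ρ_k and σ_k (the ρ_k interchange the two pairs of components, the σ_k act within each pair) and takes the old matrices as α_1 = σ_1, α_2 = ρ_3σ_2, α_3 = σ_3, α_4 = ρ_2σ_2; the helper names are ours.

```python
from math import sqrt

def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]

def scaled(a, factor):
    return [[factor * x for x in row] for row in a]

def max_abs_difference(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
sigma = [kron(I2, s) for s in PAULI]
rho = [kron(s, I2) for s in PAULI]
I4 = kron(I2, I2)

T = scaled(add(rho[1], rho[2]), 1 / sqrt(2))   # T = (rho_2 + rho_3) / sqrt(2)

def conjugate_by_T(a):
    """T a T, which equals T^+ a T because T is hermitian."""
    return matmul(T, matmul(a, T))

errors = {
    "T^2 = 1": max_abs_difference(matmul(T, T), I4),
    "T rho2 T = rho3": max_abs_difference(conjugate_by_T(rho[1]), rho[2]),
    "T rho3 T = rho2": max_abs_difference(conjugate_by_T(rho[2]), rho[1]),
    "T rho1 T = -rho1": max_abs_difference(conjugate_by_T(rho[0]), scaled(rho[0], -1)),
}

alpha_old = [sigma[0], matmul(rho[2], sigma[1]), sigma[2], matmul(rho[1], sigma[1])]
alpha_new = [conjugate_by_T(a) for a in alpha_old]

# Reality properties stated in the text.
largest_imag_first_three = max(abs(x.imag) for a in alpha_new[:3] for row in a for x in row)
largest_real_fourth = max(abs(x.real) for row in alpha_new[3] for x in row)

# The transformation interchanges the forms of alpha_2 and alpha_4.
swap_error = max(
    max_abs_difference(alpha_new[1], alpha_old[3]),
    max_abs_difference(alpha_new[3], alpha_old[1]),
)
print(errors, largest_imag_first_three, largest_real_fourth, swap_error)
```

Because the first three transformed matrices are real and the fourth is pure imaginary, the combination of the derivative terms with i times the mass term has real coefficients.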


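1 Since we are dealing with 4×4 matrices, there is no danger of confusing
them with the Pauli matrices (4.14).

We can now write the system of four differential equations for
the wave function of a free electron, (5.2), as

$$\frac{\partial \psi_2^0}{\partial x} - \frac{\partial \psi_4^0}{\partial y} + \frac{\partial \psi_1^0}{\partial z} + \frac{1}{c}\frac{\partial \psi_1^0}{\partial t} + \frac{mc}{\hbar}\,\psi_2^0 = 0$$

$$\frac{\partial \psi_1^0}{\partial x} + \frac{\partial \psi_3^0}{\partial y} - \frac{\partial \psi_2^0}{\partial z} + \frac{1}{c}\frac{\partial \psi_2^0}{\partial t} - \frac{mc}{\hbar}\,\psi_1^0 = 0$$

$$\frac{\partial \psi_4^0}{\partial x} + \frac{\partial \psi_2^0}{\partial y} + \frac{\partial \psi_3^0}{\partial z} + \frac{1}{c}\frac{\partial \psi_3^0}{\partial t} - \frac{mc}{\hbar}\,\psi_4^0 = 0$$

$$\frac{\partial \psi_3^0}{\partial x} - \frac{\partial \psi_1^0}{\partial y} - \frac{\partial \psi_4^0}{\partial z} + \frac{1}{c}\frac{\partial \psi_4^0}{\partial t} + \frac{mc}{\hbar}\,\psi_3^0 = 0 \tag{5.9}$$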


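Last, let us write in explicit form the relationship between the
wave functions $\psi'_k$, which correspond to the choice of matrices $\alpha'_k$
according to Dirac [see (4.24)], and our wave functions. We have

$$\psi'_1 = \frac{\psi_1 - \psi_4}{\sqrt{2}}, \quad \psi'_2 = \frac{\psi_2 + \psi_3}{\sqrt{2}}, \quad \psi'_3 = \frac{\psi_1 + \psi_4}{\sqrt{2}}, \quad \psi'_4 = \frac{\psi_2 - \psi_3}{\sqrt{2}} \tag{5.10}$$

We introduce the unitary matrix S that corresponds to the
transformation

$$\psi' = S\psi \tag{5.11}$$

This matrix is

$$S = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & -1 & 0 \end{pmatrix} = \frac{1 - i\rho_2}{\sqrt{2}} \cdot \frac{1 + i\sigma_2}{\sqrt{2}} \cdot \frac{1 - i\rho_3\sigma_2}{\sqrt{2}} \tag{5.12}$$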


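As for the relationship between $\psi^0$ and ψ, which corresponds to
the transformation

$$\psi^0 = T\psi \tag{5.13}$$

this is given by the formulas

$$\psi_1^0 = \frac{1}{\sqrt{2}}\left(\psi_1 - i\psi_3\right), \quad \psi_2^0 = \frac{1}{\sqrt{2}}\left(\psi_2 - i\psi_4\right)$$
$$\psi_3^0 = \frac{1}{\sqrt{2}}\left(-\psi_3 + i\psi_1\right), \quad \psi_4^0 = \frac{1}{\sqrt{2}}\left(-\psi_4 + i\psi_2\right) \tag{5.14}$$

Since $T^2 = 1$, the same formulas hold with $\psi_k$ and $\psi_k^0$ interchanged. We have

$$\psi_1 = \frac{1}{\sqrt{2}}\left(\psi_1^0 - i\psi_3^0\right), \quad \psi_2 = \frac{1}{\sqrt{2}}\left(\psi_2^0 - i\psi_4^0\right)$$
$$\psi_3 = \frac{1}{\sqrt{2}}\left(-\psi_3^0 + i\psi_1^0\right), \quad \psi_4 = \frac{1}{\sqrt{2}}\left(-\psi_4^0 + i\psi_2^0\right) \tag{5.15}$$

6. Lorentz transformations

We will now prove the invariance of the wave equation under
Lorentz transformations and investigate the geometric properties
of $\psi_1, \psi_2, \psi_3, \psi_4$.
We put

$$x = x_1, \quad y = x_2, \quad z = x_3, \quad ct = x_0 \tag{6.1}$$

and introduce four numbers:

$$e_0 = 1, \quad e_1 = e_2 = e_3 = -1 \tag{6.2}$$

so that we can write the square of the four-dimensional separation
in the form

$$ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 = \sum_{k=0}^{3} e_k\, dx_k^2 \tag{6.3}$$

We write the Lorentz transformations as

$$x'_i = \sum_{k=0}^{3} e_k a_{ik} x_k \tag{6.4}$$

where $a_{ik}$ are real numbers that satisfy the condition

$$\sum_{i=0}^{3} e_i a_{ik} a_{il} = e_k \delta_{kl} \tag{6.5}$$

These conditions follow from the fact that transformations (6.4)
must leave $ds^2$ invariant. Owing to these conditions the solution
of (6.4) for $x_k$ yields

$$x_k = \sum_{i=0}^{3} e_i a_{ik} x'_i \tag{6.6}$$

which in turn leads to the equations

$$\sum_{i=0}^{3} e_i a_{ki} a_{li} = e_k \delta_{kl} \tag{6.7}$$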
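The orthogonality conditions on the coefficients of a Lorentz transformation can be illustrated for the simplest case, a boost along the z axis. The sketch below builds the coefficients from the convention that the transformed coordinate x'_i is the sum over k of e_k a_ik x_k, with e_0 = 1 and e_1 = e_2 = e_3 = -1, and then tests both forms of the condition as well as the invariance of the interval. The value of v/c and the sample event are arbitrary choices of ours.

```python
from math import sqrt

E = [1, -1, -1, -1]  # the numbers e_0, e_1, e_2, e_3

def boost_coefficients(beta):
    """Coefficients a_ik for a boost with velocity v = beta*c along the z axis."""
    gamma = 1.0 / sqrt(1.0 - beta ** 2)
    # Ordinary boost matrix: x'_i = sum_k lam[i][k] * x_k, index order (0, 1, 2, 3).
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma
    lam[0][3] = -gamma * beta
    lam[3][0] = -gamma * beta
    lam[3][3] = gamma
    lam[1][1] = lam[2][2] = 1.0
    # x'_i = sum_k e_k a_ik x_k, hence a_ik = e_k * lam[i][k] because e_k**2 == 1.
    return [[E[k] * lam[i][k] for k in range(4)] for i in range(4)]

def column_condition(a, k, l):
    """sum_i e_i a_ik a_il, which should equal e_k delta_kl."""
    return sum(E[i] * a[i][k] * a[i][l] for i in range(4))

def row_condition(a, k, l):
    """sum_i e_i a_ki a_li, which should also equal e_k delta_kl."""
    return sum(E[i] * a[k][i] * a[l][i] for i in range(4))

a = boost_coefficients(0.6)
target = lambda k, l: E[k] if k == l else 0
column_error = max(abs(column_condition(a, k, l) - target(k, l))
                   for k in range(4) for l in range(4))
row_error = max(abs(row_condition(a, k, l) - target(k, l))
                for k in range(4) for l in range(4))

# Invariance of the interval for a sample event (x0, x1, x2, x3).
x = [2.0, 0.3, -1.1, 0.8]
x_new = [sum(E[k] * a[i][k] * x[k] for k in range(4)) for i in range(4)]
interval_old = sum(E[k] * x[k] ** 2 for k in range(4))
interval_new = sum(E[k] * x_new[k] ** 2 for k in range(4))
print(column_error, row_error, interval_old, interval_new)
```

With this sign convention some of the coefficients a_ik differ in sign from the entries of the usual boost matrix; only the products e_k a_ik have a direct meaning as transformation coefficients.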


If we multiply Eq. (5.1) by $i/(\hbar c)$, we can write it in the form

$$\sum_{k=1}^{3} \alpha_k \frac{\partial \psi}{\partial x_k} + \frac{imc}{\hbar}\,\alpha_4\psi + \frac{1}{c}\frac{\partial \psi}{\partial t} = 0 \tag{6.8}$$

or

$$\sum_{k=0}^{3} \alpha_k \frac{\partial \psi}{\partial x_k} + \frac{imc}{\hbar}\,\alpha_4\psi = 0 \tag{6.9}$$

provided that $\alpha_0$ is the unit matrix.
Now let us change variables according to (6.6). We have


$$\frac{\partial \psi}{\partial x_k} = \sum_{i=0}^{3} e_k a_{ik} \frac{\partial \psi}{\partial x'_i}$$

Hence

$$\sum_{i=0}^{3} \sum_{k=0}^{3} e_k a_{ik}\, \alpha_k \frac{\partial \psi}{\partial x'_i} + \frac{imc}{\hbar}\,\alpha_4\psi = 0 \tag{6.10}$$

If we can find a matrix S (not a unitary one, generally
speaking) such that

$$\alpha'_i = S^{+}\alpha_i S = \sum_{k=0}^{3} e_k a_{ik}\, \alpha_k, \quad i = 0, 1, 2, 3 \tag{6.11}$$

and

$$S^{+}\alpha_4 S = \alpha_4 \tag{6.12}$$

then Eq. (6.10) can be written thus:

$$\sum_{i=0}^{3} S^{+}\alpha_i S\, \frac{\partial \psi}{\partial x'_i} + \frac{imc}{\hbar}\, S^{+}\alpha_4 S\, \psi = 0 \tag{6.13}$$

If we then put

$$\psi' = S\psi \tag{6.14}$$

and premultiply (6.13) by $(S^{+})^{-1}$, that is, apply to the four
equations (6.13) the transformation inverse to $S^{+}$, we get

$$\sum_{k=0}^{3} \alpha_k \frac{\partial \psi'}{\partial x'_k} + \frac{imc}{\hbar}\,\alpha_4\psi' = 0 \tag{6.15}$$

that is, an equation of the same type as the initial one, (6.9),
with the same matrices $\alpha_k$ but with new independent variables
$x'_0, x'_1, x'_2, x'_3$ and new functions $\psi'_1, \psi'_2, \psi'_3, \psi'_4$. Hence we will be
able to prove that

If the Lorentz transformations are accompanied by
transformation (6.14), which acts on functions ψ, the wave equation
will keep its form.

In other words, we will prove the invariance of the wave equation
under the Lorentz transformations.

7. Form of matrix S for spatial rotations
of axes and for Lorentz transformations


We will show that a matrix S with the needed properties does
indeed exist and that with our choice of matrices $\alpha_k$ it has the
form

$$S = \begin{pmatrix} \alpha & \beta & 0 & 0 \\ \gamma & \delta & 0 & 0 \\ 0 & 0 & \bar{\alpha} & \bar{\beta} \\ 0 & 0 & \bar{\gamma} & \bar{\delta} \end{pmatrix} \tag{7.1}$$

where α, β, γ, δ are four complex parameters connected by the
relationship

$$\alpha\delta - \beta\gamma = 1 \tag{7.2}$$

These parameters are called the generalized Cayley-Klein
parameters. The hermitian conjugate of S will be

$$S^{+} = \begin{pmatrix} \bar{\alpha} & \bar{\gamma} & 0 & 0 \\ \bar{\beta} & \bar{\delta} & 0 & 0 \\ 0 & 0 & \alpha & \gamma \\ 0 & 0 & \beta & \delta \end{pmatrix} \tag{7.3}$$

First of all it is easy to prove by direct calculation that Eq. (6.12)
is an identity owing to the relationship (7.2) if we perform three
transformations successively: first S, then $\alpha_4$, and last $S^{+}$. Note
that we also have

$$S^{+}\alpha_5 S = \alpha_5 \tag{7.4}$$

To make certain that for any Lorentz transformation we can
choose parameters α, β, γ, δ so that Eq. (6.11) will hold, we will
use the fact that the Lorentz transformations and the
transformations S each form a group, that is, that several successive
transformations can be replaced by one of the same type. We can
obtain the most general Lorentz transformation equations by
applying successively transformations that are simple in form.
For instance, we can rotate the coordinate system about the
x, y, z axes and then apply the transformation

$$z' = \frac{z - vt}{(1 - v^2/c^2)^{1/2}}, \quad t' = \frac{t - vz/c^2}{(1 - v^2/c^2)^{1/2}} \tag{7.5}$$

If we then find the matrix S for each of these transformations, the
matrix S of the general transformation can be obtained by
multiplying the matrices of all the transformations.

Let us first study the rotation about the z axis, since in this
case S has the simplest form. We write the transformation
equations as

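$$x'_1 = x_1 \cos\varphi - x_2 \sin\varphi$$
$$x'_2 = x_1 \sin\varphi + x_2 \cos\varphi$$
$$x'_3 = x_3$$
$$x'_0 = x_0 \tag{7.6}$$

We will show that the parameters corresponding to this rotation
are

$$\alpha = e^{-i\varphi/2}, \quad \beta = 0, \quad \gamma = 0, \quad \delta = e^{i\varphi/2} \tag{7.7}$$

so that the matrix of this transformation is

$$S = \begin{pmatrix} e^{-i\varphi/2} & 0 & 0 & 0 \\ 0 & e^{i\varphi/2} & 0 & 0 \\ 0 & 0 & e^{i\varphi/2} & 0 \\ 0 & 0 & 0 & e^{-i\varphi/2} \end{pmatrix} \tag{7.8}$$

We can write it also as

$$S = \cos\frac{\varphi}{2} - i \sin\frac{\varphi}{2}\, \rho_3\sigma_3 \tag{7.9}$$

or, according to (4.25), as

$$S = \cos\frac{\varphi}{2} - i \sin\frac{\varphi}{2}\, \sigma_z \tag{7.10}$$

We have

$$S^{+}\alpha_1 S = S^{+}\rho_a\sigma_x S = \rho_a \left(\cos\frac{\varphi}{2} + i \sin\frac{\varphi}{2}\, \sigma_z\right) \sigma_x \left(\cos\frac{\varphi}{2} - i \sin\frac{\varphi}{2}\, \sigma_z\right)$$
$$= \rho_a \left(\cos\varphi + i \sin\varphi\, \sigma_z\right) \sigma_x = \rho_a \left(\sigma_x \cos\varphi - \sigma_y \sin\varphi\right)$$

so that

$$\alpha'_1 = S^{+}\alpha_1 S = \alpha_1 \cos\varphi - \alpha_2 \sin\varphi \tag{7.11}$$

and in a similar manner

$$\alpha'_2 = S^{+}\alpha_2 S = \alpha_1 \sin\varphi + \alpha_2 \cos\varphi$$
$$\alpha'_3 = S^{+}\alpha_3 S = \alpha_3$$
$$\alpha'_0 = S^{+}\alpha_0 S = \alpha_0 \tag{7.11*}$$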

Thus the new matrices are expressed in terms of the old a* 
in the same way as the new (transformed) coordinates x' k are 
expressed in terms of the old Xu, which means that relationships 
(6.11) hold true. 
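
The statement (7.11), (7.11*) can be spot-checked numerically. The sketch below is only an illustration: it assumes that the 4 × 4 matrices for $\sigma_x$, $\sigma_y$, $\sigma_z$ and $\rho_a$ are the ones written out explicitly in Section 1 of Chapter II, that $\alpha_k = \rho_a\sigma_k$, and it picks an arbitrary angle. The helper names (`matmul`, `combo`, `close`) are hypothetical.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def combo(*terms):
    # Linear combination of matrices: combo((c1, M1), (c2, M2), ...)
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
# Assumed explicit matrices (taken from Chapter II, Section 1).
SX = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, -1, 0]]
SY = [[0, -1j, 0, 0], [1j, 0, 0, 0], [0, 0, 0, -1j], [0, 0, 1j, 0]]
SZ = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]
RA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
ALPHA = [matmul(RA, s) for s in (SX, SY, SZ)]  # alpha_k = rho_a * sigma_k

phi = 0.7  # arbitrary rotation angle about the z axis
S = combo((math.cos(phi / 2), ONE), (-1j * math.sin(phi / 2), SZ))  # Eq. (7.10)

def transformed(m):
    return matmul(matmul(dagger(S), m), S)

ok_1 = close(transformed(ALPHA[0]),
             combo((math.cos(phi), ALPHA[0]), (-math.sin(phi), ALPHA[1])))
ok_2 = close(transformed(ALPHA[1]),
             combo((math.sin(phi), ALPHA[0]), (math.cos(phi), ALPHA[1])))
ok_3 = close(transformed(ALPHA[2]), ALPHA[2])
print(ok_1, ok_2, ok_3)
```

The three flags correspond to the three spatial relations in (7.11) and (7.11*); the same helper can be reused for the other two axes by swapping the matrix in `S`.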
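Now we study the rotation about the axis $x = x_1$: 

$$\begin{aligned} x_2' &= x_2\cos\varphi - x_3\sin\varphi \\ x_3' &= x_2\sin\varphi + x_3\cos\varphi \\ x_1' &= x_1, \qquad x_0' = x_0 \end{aligned} \tag{7.12}$$

Here 

$$\alpha = \cos\frac{\varphi}{2}, \quad \beta = -i\sin\frac{\varphi}{2}, \quad \gamma = -i\sin\frac{\varphi}{2}, \quad \delta = \cos\frac{\varphi}{2} \tag{7.13}$$

so that the transformation matrix 

$$S = \cos\frac{\varphi}{2} - i\sin\frac{\varphi}{2}\,\rho_3\sigma_1 \tag{7.14}$$

or 

$$S = \cos\frac{\varphi}{2} - i\sin\frac{\varphi}{2}\,\sigma_x \tag{7.14*}$$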


To prove that (6.11) is valid in this case we can use the same 
method as in the previous case. 

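Finally, for the rotation about the axis $y = x_2$ we have 

$$\begin{aligned} x_1' &= x_1\cos\varphi + x_3\sin\varphi \\ x_2' &= x_2 \\ x_3' &= -x_1\sin\varphi + x_3\cos\varphi \\ x_0' &= x_0 \end{aligned} \tag{7.15}$$

The parameters in this case are 

$$\alpha = \cos\frac{\varphi}{2}, \quad \beta = -\sin\frac{\varphi}{2}, \quad \gamma = \sin\frac{\varphi}{2}, \quad \delta = \cos\frac{\varphi}{2} \tag{7.16}$$

If we use these parameters to build matrix $S$ for this rotation, 
we get 

$$S = \cos\frac{\varphi}{2} - i\sin\frac{\varphi}{2}\,\sigma_2 \tag{7.17}$$

or 

$$S = \cos\frac{\varphi}{2} - i\sin\frac{\varphi}{2}\,\sigma_y \tag{7.17*}$$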

In all three cases a rotation about axis $x_k$ by an angle $\varphi$ in the 
positive direction has corresponding to it a unitary matrix 

$$S = \cos\frac{\varphi}{2} - i\sin\frac{\varphi}{2}\,\sigma_{x_k} \tag{7.18}$$

where the result does not depend on the choice of matrices $\alpha_k$. 
To generalize the result we can say that a spatial rotation by an 
angle $\omega$ about an axis with direction cosines $l$, $m$, $n$ has 
corresponding to it a matrix 

$$S = \cos\frac{\omega}{2} - i\sin\frac{\omega}{2}\,(l\sigma_x + m\sigma_y + n\sigma_z) \tag{7.19}$$


We now turn to the Lorentz transformation equations proper, 
(7.5), which we write as 

$$x_1' = x_1, \quad x_2' = x_2, \quad x_3' = \frac{x_3 - vx_0/c}{(1 - v^2/c^2)^{1/2}}, \quad x_0' = \frac{x_0 - vx_3/c}{(1 - v^2/c^2)^{1/2}} \tag{7.20}$$

We put 

$$\frac{v}{c} = \tanh u \tag{7.21}$$

so that 

$$(1 - v^2/c^2)^{-1/2} = \cosh u, \qquad (v/c)\,(1 - v^2/c^2)^{-1/2} = \sinh u \tag{7.21*}$$

The parameters in this case are 

$$\alpha = e^{-u/2}, \quad \beta = 0, \quad \gamma = 0, \quad \delta = e^{u/2} \tag{7.22}$$

Here matrix $S$ will no longer be unitary but with our choice of 
matrices $\alpha_k$ will remain of type (7.1). We write it as 

$$S = \cosh\frac{u}{2} - \sinh\frac{u}{2}\,\sigma_3 \tag{7.23}$$

or 

$$S = \cosh\frac{u}{2} - \sinh\frac{u}{2}\,\alpha_3 \tag{7.24}$$


It is easy to verify that now too the relationships (6.11) hold. 

For the Lorentz transformation corresponding to the motion 
along the $x_k$ axis with the velocity $v = c\tanh u$ the matrix $S$ is 

$$S = \cosh\frac{u}{2} - \sinh\frac{u}{2}\,\alpha_k \tag{7.25}$$

and for the transformation corresponding to the motion along the 
line with direction cosines $l$, $m$, $n$ 

$$S = \cosh\frac{u}{2} - \sinh\frac{u}{2}\,(l\alpha_1 + m\alpha_2 + n\alpha_3) \tag{7.26}$$

Hence in all cases we can find a matrix $S$ of type (7.1) that 
satisfies relationships (6.11). We have thus proved the invariance 
of the wave equation under a Lorentz transformation. 
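
As a rough numerical illustration of the boost case, the sketch below checks that $S = \cosh(u/2) - \sinh(u/2)\,\alpha_3$ mixes $\alpha_3$ and $\alpha_0 = 1$ in the same way that (7.20) mixes $x_3$ and $x_0$, and that it leaves $\alpha_4$ and $\alpha_5$ unchanged. The explicit matrices are assumptions: they are taken from the matrices quoted in Section 1 of Chapter II, with $\alpha_3 = \rho_a\sigma_z$, $\alpha_4 = \rho_c$, $\alpha_5 = \rho_b$. The rapidity value is arbitrary.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combo(*terms):
    # Linear combination of matrices: combo((c1, M1), (c2, M2), ...)
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
# Assumed explicit forms (from the matrices listed in Chapter II, Section 1).
ALPHA3 = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]       # rho_a * sigma_z
ALPHA4 = [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]       # rho_c
ALPHA5 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]   # rho_b

u = 0.9  # arbitrary rapidity, v = c * tanh(u)
S = combo((math.cosh(u / 2), ONE), (-math.sinh(u / 2), ALPHA3))

def transformed(m):
    # S is real and diagonal here, so its hermitian conjugate is S itself.
    return matmul(matmul(S, m), S)

ok_space = close(transformed(ALPHA3),
                 combo((math.cosh(u), ALPHA3), (-math.sinh(u), ONE)))
ok_time = close(transformed(ONE),
                combo((math.cosh(u), ONE), (-math.sinh(u), ALPHA3)))
ok_alpha4 = close(transformed(ALPHA4), ALPHA4)
ok_alpha5 = close(transformed(ALPHA5), ALPHA5)
not_unitary = not close(transformed(ONE), ONE)
print(ok_space, ok_time, ok_alpha4, ok_alpha5, not_unitary)
```

The last flag reflects the remark in the text that $S$ is not unitary for a Lorentz transformation proper.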

We note that the Cayley-Klein parameters and hence matrix $S$ 
are determined for a given rotation to within their sign. In our 
formulas the sign of matrix $S$ has been chosen so that an 
infinitesimal rotation, or a Lorentz transformation with an infinitesimal 
velocity, has corresponding to it a matrix that differs from $S = +1$ 
by a second-order infinitesimal. 
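
The sign ambiguity can be made concrete with a small sketch: a rotation by $2\pi$ about any axis returns the coordinates to themselves, yet formula (7.19) gives $S = -1$. The code below assumes the explicit 4 × 4 forms of $\sigma_x$, $\sigma_y$, $\sigma_z$ quoted in Section 1 of Chapter II; the axis and the function name `rotation` are illustrative choices.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def combo(*terms):
    # Linear combination of matrices: combo((c1, M1), (c2, M2), ...)
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
SX = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, -1, 0]]
SY = [[0, -1j, 0, 0], [1j, 0, 0, 0], [0, 0, 0, -1j], [0, 0, 1j, 0]]
SZ = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]

def rotation(omega, l, m, n):
    """Matrix S of Eq. (7.19) for angle omega and direction cosines l, m, n."""
    axis = combo((l, SX), (m, SY), (n, SZ))
    return combo((math.cos(omega / 2), ONE), (-1j * math.sin(omega / 2), axis))

l, m, n = 2 / 3, 1 / 3, 2 / 3          # l^2 + m^2 + n^2 = 1
R = rotation(1.1, l, m, n)
is_unitary = close(matmul(dagger(R), R), ONE)
full_turn_is_minus_one = close(rotation(2 * math.pi, l, m, n), combo((-1, ONE)))
two_turns_is_plus_one = close(rotation(4 * math.pi, l, m, n), ONE)
print(is_unitary, full_turn_is_minus_one, two_turns_is_plus_one)
```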
8. Current density 

We see that to each Lorentz transformation there corresponds a 
definite (to within its sign) transformation of the functions 
$\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$. These functions therefore represent a kind of 
geometric quantity, such as a vector or tensor. We can call this 
quantity a tensor of rank 1/2 or half-vector. The justification for 
this name is that some quadratic forms of $\psi$ are transformed as 
a four-dimensional vector. Indeed, let us put 

$$A_k = \bar\psi\,\alpha_k\,\psi, \qquad k = 0, 1, 2, 3, 4, 5 \tag{8.1}$$

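or in explicit form (with our choice of $\alpha_k$) 

$$\begin{aligned} A_0 &= \bar\psi_1\psi_1 + \bar\psi_2\psi_2 + \bar\psi_3\psi_3 + \bar\psi_4\psi_4 \\ A_1 &= \bar\psi_1\psi_2 + \bar\psi_2\psi_1 + \bar\psi_3\psi_4 + \bar\psi_4\psi_3 \\ A_2 &= -i\bar\psi_1\psi_2 + i\bar\psi_2\psi_1 + i\bar\psi_3\psi_4 - i\bar\psi_4\psi_3 \\ A_3 &= \bar\psi_1\psi_1 - \bar\psi_2\psi_2 + \bar\psi_3\psi_3 - \bar\psi_4\psi_4 \\ A_4 &= -\bar\psi_1\psi_4 + \bar\psi_2\psi_3 + \bar\psi_3\psi_2 - \bar\psi_4\psi_1 \\ A_5 &= -i\bar\psi_1\psi_4 + i\bar\psi_2\psi_3 - i\bar\psi_3\psi_2 + i\bar\psi_4\psi_1 \end{aligned} \tag{8.2}$$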

Formulas (6.11) then give 

$$A_i' = \sum_{k=0}^{3} a_{ik}A_k, \qquad i = 0, 1, 2, 3 \tag{8.3}$$

These equalities show that $A_0$, $A_1$, $A_2$, $A_3$ transform like the 
components of a four-dimensional vector. The formulas (6.12) and 
(7.4), on the other hand, give 

$$A_4' = A_4, \qquad A_5' = A_5 \tag{8.4}$$

which means that the quantities $A_4$ and $A_5$ are four-dimensional 
invariants. The quantities $A_k$ are linked thus: 

$$A_1^2 + A_2^2 + A_3^2 + A_4^2 + A_5^2 = A_0^2 \tag{8.5}$$
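
Relation (8.5) can be tried out on an arbitrary four-component function. The sketch below is an illustration only: the component values of `psi` are made up, and the matrices are assumed to be the explicit ones quoted in Section 1 of Chapter II, with $\alpha_k = \rho_a\sigma_k$ for $k = 1, 2, 3$, $\alpha_4 = \rho_c$ and $\alpha_5 = \rho_b$.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
SX = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, -1, 0]]
SY = [[0, -1j, 0, 0], [1j, 0, 0, 0], [0, 0, 0, -1j], [0, 0, 1j, 0]]
SZ = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]
RA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
RB = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
RC = [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]

# alpha_0 ... alpha_5 under the stated assumptions
ALPHAS = [ONE, matmul(RA, SX), matmul(RA, SY), matmul(RA, SZ), RC, RB]

# Arbitrary (made-up) components psi_1 ... psi_4
psi = [0.3 + 0.4j, -1.1 + 0.2j, 0.5 - 0.7j, 0.9 + 0.1j]

def bilinear(m):
    """Quadratic form  sum_ij conj(psi_i) * m_ij * psi_j  of Eq. (8.1)."""
    return sum(psi[i].conjugate() * m[i][j] * psi[j]
               for i in range(4) for j in range(4))

A = [bilinear(m) for m in ALPHAS]
all_real = all(abs(a.imag) < 1e-12 for a in A)   # hermitian matrices give real A_k
lhs = sum(a.real ** 2 for a in A[1:])            # A_1^2 + ... + A_5^2
rhs = A[0].real ** 2                             # A_0^2
print(all_real, abs(lhs - rhs) < 1e-12)
```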


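Let us show that if the $A_k$ are built by means of the functions $\psi$, 
which satisfy the wave equation, we have the following equation: 

$$\frac{\partial A_1}{\partial x} + \frac{\partial A_2}{\partial y} + \frac{\partial A_3}{\partial z} + \frac{1}{c}\frac{\partial A_0}{\partial t} = 0 \tag{8.6}$$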


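For this we write Eq. (6.9) and its complex conjugate: 

$$\sum_{k=0}^{3}\alpha_k\frac{\partial\psi}{\partial x_k} + \frac{imc}{\hbar}\,\alpha_4\psi = 0 \tag{8.7}$$

$$\sum_{k=0}^{3}\frac{\partial\bar\psi}{\partial x_k}\,\alpha_k - \frac{imc}{\hbar}\,\bar\psi\alpha_4 = 0 \tag{8.7*}$$

We premultiply the first of these equations by $\bar\psi$, postmultiply 
the second by $\psi$, and add the results. The second terms in the 
equations cancel out, and we get 

$$\sum_{k=0}^{3}\frac{\partial}{\partial x_k}\,(\bar\psi\,\alpha_k\,\psi) = 0 \tag{8.8}$$

that is, we come to Eq. (8.6). 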

If we now integrate (8.6) over a volume $V$ encompassed by a 
surface $\sigma$, we get 

$$\frac{d}{dt}\int_V A_0\,d\tau = -c\oint_\sigma\,[A_1\cos(n, x) + A_2\cos(n, y) + A_3\cos(n, z)]\,d\sigma \tag{8.9}$$

The physical meaning of $A_0$ is the probability density. In 
Schrödinger's theory this is $\bar\psi\psi$. A comparison of (8.6) and (8.9) with 
(2.7) and (2.9) of Chapter III, Part II, shows that vector $cA_k$ 
plays the role of vector $\mathbf{S}$ in Schrödinger's theory and is the 
counterpart of the electron flux. Hence the classical charge 
density $\rho$ and current density $\rho\mathbf{v}$ have corresponding to them the 
quantum counterparts 

$$\rho \to -e\bar\psi\psi = -eA_0 \tag{8.10}$$

$$\rho v_k \to -eS_k = -ec\,\bar\psi\alpha_k\psi = -ecA_k, \qquad k = 1, 2, 3 \tag{8.11}$$

9. The Dirac equation in the case of a field. 

Equations of motion 

The wave equation for the free electron introduced in previous 
sections had the form 

$$H\psi - i\hbar\frac{\partial\psi}{\partial t} = 0 \tag{9.1}$$

with the Hamiltonian 

$$H = c\,(\alpha_1 p_x + \alpha_2 p_y + \alpha_3 p_z) + mc^2\alpha_4 \tag{9.2}$$

We must now generalize this equation for an electron in an 
electromagnetic field. The classical Hamiltonian function, (2.9), 
can be found from the function in the case of no field, provided 
that we add to the latter the potential energy $-e\Phi$ and change the 
generalized momenta $p_x$, $p_y$, $p_z$ to momenta $P_x$, $P_y$, $P_z$ according 
to (2.7). We did this in Pauli's theory. Let us try the same 
substitution in our relativistic Hamiltonian (9.2) and put 

$$H = c\,[\alpha_1 P_x + \alpha_2 P_y + \alpha_3 P_z] + mc^2\alpha_4 - e\Phi \tag{9.3}$$

where 

$$P_x = p_x + \frac{e}{c}A_x, \qquad P_y = p_y + \frac{e}{c}A_y, \qquad P_z = p_z + \frac{e}{c}A_z \tag{9.4}$$

and where $p_x$, $p_y$, $p_z$ are understood to be the operators 

$$p_x = -i\hbar\frac{\partial}{\partial x}, \qquad p_y = -i\hbar\frac{\partial}{\partial y}, \qquad p_z = -i\hbar\frac{\partial}{\partial z}$$

The main justification for this transition from the equation 
without a field to the equation for an electron in an electromagnetic 
field is this. The directly observable physical quantities are the 
electric and magnetic fields $\mathcal{E}$ and $\mathcal{H}$. The potentials, on the other 
hand, are auxiliary mathematical functions determined only to 
within the transformations 

$$\mathbf{A}' = \mathbf{A} + \operatorname{grad} f, \qquad \Phi' = \Phi - \frac{1}{c}\frac{\partial f}{\partial t} \tag{9.5}$$

which leave the field unchanged. For this reason we must require 
that all physical effects that follow from the wave equation remain 
unchanged when we substitute $\mathbf{A}'$ and $\Phi'$ for $\mathbf{A}$ and $\Phi$. This 
requirement will be met if such a substitution has corresponding to 
it a unitary transformation of operators and wave functions. Let 
us show that this proposition is true if the Hamiltonian is of 
type (9.3). 

Let us denote by $H'$ an operator of type (9.3) in which $\mathbf{A}'$ and $\Phi'$ 
are substituted for $\mathbf{A}$ and $\Phi$ by means of (9.5). If we use relations 
of type 

$$\Bigl[-i\hbar\frac{\partial}{\partial x} + \frac{e}{c}A_x + \frac{e}{c}\frac{\partial f}{\partial x}\Bigr]e^{-ief/\hbar c}\,\psi = e^{-ief/\hbar c}\Bigl[-i\hbar\frac{\partial}{\partial x} + \frac{e}{c}A_x\Bigr]\psi \tag{9.6}$$

we can easily show that, if $\psi$ satisfies (9.1), the function 

$$\psi' = e^{-ief/\hbar c}\,\psi \tag{9.7}$$

will be the solution to 

$$H'\psi' - i\hbar\frac{\partial\psi'}{\partial t} = 0 \tag{9.8}$$

Thus adding a gradient to the vector potential is equivalent 
to introducing a phase factor into the wave function, which is a 
particular case of a unitary transformation. 
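
The intermediate step behind this statement can be written out explicitly. The following is a sketch under the assumption that the phase factor has the form $e^{-ief/\hbar c}$, which is the choice consistent with $P_x = p_x + (e/c)A_x$ and with the transformation (9.5); only the $x$ component and the time component are shown.

```latex
\begin{aligned}
\Bigl(-i\hbar\frac{\partial}{\partial x} + \frac{e}{c}A_x + \frac{e}{c}\frac{\partial f}{\partial x}\Bigr)
\bigl(e^{-ief/\hbar c}\psi\bigr)
&= e^{-ief/\hbar c}\Bigl(-i\hbar\frac{\partial\psi}{\partial x}
   - \frac{e}{c}\frac{\partial f}{\partial x}\psi\Bigr)
 + e^{-ief/\hbar c}\Bigl(\frac{e}{c}A_x + \frac{e}{c}\frac{\partial f}{\partial x}\Bigr)\psi \\
&= e^{-ief/\hbar c}\Bigl(-i\hbar\frac{\partial}{\partial x} + \frac{e}{c}A_x\Bigr)\psi , \\[4pt]
\Bigl(-e\Phi + \frac{e}{c}\frac{\partial f}{\partial t} - i\hbar\frac{\partial}{\partial t}\Bigr)
\bigl(e^{-ief/\hbar c}\psi\bigr)
&= e^{-ief/\hbar c}\Bigl(-e\Phi - i\hbar\frac{\partial}{\partial t}\Bigr)\psi .
\end{aligned}
```

In both lines the derivative of the phase factor produces a term $-(e/c)\,\partial f$ that cancels the term added to the potential, so the whole operator of the primed wave equation acting on the primed function reduces to the phase factor times the original operator acting on $\psi$.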

The wave equation (9.1) with the Hamiltonian (9.3) will 
obviously be invariant under a Lorentz transformation because the 
vector potential is transformed in the same way as the gradient, 
and the scalar potential as $-(1/c)(\partial/\partial t)$. Besides, and this can 
easily be shown, Eq. (6.8) is still correct. 

How can we prove the validity of the new wave equation? A 
formal proof is to consider the equations of motion. We recall the 
formula for the total time derivative of an operator $L$ [see (13.22), 
Chapter III, Part I], that is, 

$$\frac{dL}{dt} = \frac{\partial L}{\partial t} + \frac{i}{\hbar}\,(HL - LH) \tag{9.9}$$

Let us see whether substituting $x$, $y$, $z$, $P_x$, $P_y$, $P_z$ successively 
for $L$ brings us to the classical equations of motion, which was 
the case in Schrödinger's theory. We put 

$$L = x, y, z$$

and obtain 

$$\dot x = \frac{dx}{dt} = c\alpha_1, \qquad \dot y = \frac{dy}{dt} = c\alpha_2, \qquad \dot z = \frac{dz}{dt} = c\alpha_3 \tag{9.10}$$

These are the operators for the velocity components of the 
electron. They do not commute with each other, and the square of 
each is $c^2$, that is, the square of the speed of light. The 
eigenvalues of the three operators are $\pm c$. It turns out then that 
measurement of a component of the velocity of an electron is certain to 
lead to the result $\pm c$. The question of whether this paradoxical 
result has a physical meaning remains open. The author is 
inclined to consider this result as a defect of Dirac's theory. 
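
The algebraic facts used here (each $\alpha_k$ squares to unity, has zero trace, and does not commute with the others) can be checked directly. The sketch assumes the explicit matrices quoted in Section 1 of Chapter II with $\alpha_k = \rho_a\sigma_k$. A matrix whose square is unity can only have eigenvalues $\pm 1$, and zero trace means each sign occurs twice, so $c\alpha_k$ has eigenvalues $\pm c$.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]
SX = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, -1, 0]]
SY = [[0, -1j, 0, 0], [1j, 0, 0, 0], [0, 0, 0, -1j], [0, 0, 1j, 0]]
SZ = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]
RA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
ALPHA = [matmul(RA, s) for s in (SX, SY, SZ)]  # velocity operators divided by c

squares_are_one = all(close(matmul(a, a), ONE) for a in ALPHA)
traces_vanish = all(abs(sum(a[i][i] for i in range(4))) < 1e-12 for a in ALPHA)
# Different components anticommute, hence they cannot commute.
anticommute = all(
    close(add(matmul(ALPHA[i], ALPHA[j]), matmul(ALPHA[j], ALPHA[i])), ZERO)
    for i, j in [(0, 1), (0, 2), (1, 2)]
)
print(squares_are_one, traces_vanish, anticommute)
```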

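Let us now put 

$$L = P_x = p_x + \frac{e}{c}A_x$$

and then calculate $dP_x/dt$. For this we first find 

$$\begin{aligned} \frac{i}{\hbar}\,(P_yP_z - P_zP_y) &= \frac{e}{c}\Bigl(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\Bigr) = \frac{e}{c}\,\mathcal{H}_x \\ \frac{i}{\hbar}\,(P_zP_x - P_xP_z) &= \frac{e}{c}\Bigl(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\Bigr) = \frac{e}{c}\,\mathcal{H}_y \\ \frac{i}{\hbar}\,(P_xP_y - P_yP_x) &= \frac{e}{c}\Bigl(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\Bigr) = \frac{e}{c}\,\mathcal{H}_z \end{aligned} \tag{9.11}$$

On the basis of this we have 

$$\begin{aligned} \frac{dP_x}{dt} &= \frac{\partial P_x}{\partial t} + \frac{ic}{\hbar}\,[\alpha_2(P_yP_x - P_xP_y) + \alpha_3(P_zP_x - P_xP_z)] - \frac{ie}{\hbar}\,(\Phi P_x - P_x\Phi) \\ &= e\Bigl(\frac{1}{c}\frac{\partial A_x}{\partial t} + \frac{\partial\Phi}{\partial x}\Bigr) - e\alpha_2\mathcal{H}_z + e\alpha_3\mathcal{H}_y \end{aligned}$$

or 

$$\frac{dP_x}{dt} = -e\,(\alpha_2\mathcal{H}_z - \alpha_3\mathcal{H}_y) - e\mathcal{E}_x \tag{9.12}$$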

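The last step is to introduce by means of (9.10) the operators 
$\dot x$, $\dot y$, $\dot z$ and write for $dP_y/dt$ and $dP_z/dt$ equations similar to (9.12). 
We get 

$$\begin{aligned} \frac{dP_x}{dt} &= -\frac{e}{c}\,(\dot y\mathcal{H}_z - \dot z\mathcal{H}_y) - e\mathcal{E}_x = F_x \\ \frac{dP_y}{dt} &= -\frac{e}{c}\,(\dot z\mathcal{H}_x - \dot x\mathcal{H}_z) - e\mathcal{E}_y = F_y \\ \frac{dP_z}{dt} &= -\frac{e}{c}\,(\dot x\mathcal{H}_y - \dot y\mathcal{H}_x) - e\mathcal{E}_z = F_z \end{aligned} \tag{9.13}$$

These equations coincide with the classical equations (2.2). 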

10. Angular momentum and the spin vector 
in Dirac’s theory 

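Let us consider the time derivatives of the operators $\sigma_x$, $\sigma_y$, $\sigma_z$, 
which enter into the generalized Pauli operator. For this it is 
convenient to express, by means of (4.5) and (4.10), the matrices 
$\alpha_k$ in terms of $\rho$ and $\sigma$. We can then write the Hamiltonian (9.3) 
as 

$$H = c\rho_a\,(\sigma_xP_x + \sigma_yP_y + \sigma_zP_z) + mc^2\rho_c - e\Phi \tag{10.1}$$

Recalling that the $\sigma$'s commute with the $\rho$'s, we find the time 
derivative by means of the general formula (9.9): 

$$\frac{d\sigma_x}{dt} = \frac{ic}{\hbar}\,\rho_a\,[(\sigma_y\sigma_x - \sigma_x\sigma_y)\,P_y + (\sigma_z\sigma_x - \sigma_x\sigma_z)\,P_z]$$

Whence, if we use the properties (4.6) of the $\sigma$'s, we get 

$$\frac{d\sigma_x}{dt} = \frac{2c}{\hbar}\,\rho_a\,(\sigma_zP_y - \sigma_yP_z) \tag{10.2}$$

Finally, if we write this formula together with the other two 
similar formulas, we get 

$$\begin{aligned} \frac{\hbar}{2}\frac{d\sigma_x}{dt} &= -\dot yP_z + \dot zP_y \\ \frac{\hbar}{2}\frac{d\sigma_y}{dt} &= -\dot zP_x + \dot xP_z \\ \frac{\hbar}{2}\frac{d\sigma_z}{dt} &= -\dot xP_y + \dot yP_x \end{aligned} \tag{10.3}$$

where we have used the equations of motion (9.10) and also the 
relationships (4.10), which express the $\alpha_k$ in terms of $\rho$ and $\sigma$. 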

In classical mechanics $P_x$, $P_y$, $P_z$ are proportional to $\dot x$, $\dot y$, $\dot z$, 
which means the right-hand sides in (10.3) would vanish. The 
left-hand sides would also vanish because the transition to 
classical mechanics is equivalent to $\hbar$ being zero. We note that the 
order of the multipliers in (10.3) is irrelevant. 

To return to quantum mechanics. We write the right-hand side 
of the first equation in (10.3) as 

$$-\dot yP_z + \dot zP_y = -\frac{d}{dt}\,(yP_z - zP_y) + y\frac{dP_z}{dt} - z\frac{dP_y}{dt}$$

If we now substitute the expressions for $dP_y/dt$ and $dP_z/dt$ by means of 
Eqs. (9.13), we find that Eqs. (10.3) yield 

$$\frac{d}{dt}\Bigl(yP_z - zP_y + \frac{\hbar}{2}\sigma_x\Bigr) = yF_z - zF_y \tag{10.4}$$

and similarly 

$$\begin{aligned} \frac{d}{dt}\Bigl(zP_x - xP_z + \frac{\hbar}{2}\sigma_y\Bigr) &= zF_x - xF_z \\ \frac{d}{dt}\Bigl(xP_y - yP_x + \frac{\hbar}{2}\sigma_z\Bigr) &= xF_y - yF_x \end{aligned} \tag{10.4*}$$

We can interpret these equations as the counterpart of the law 
of classical mechanics that states that the time derivative of 
angular momentum is the torque, or the moment of force. The angular 
momentum in our problem is 

$$\begin{aligned} \mathcal{M}_x &= yP_z - zP_y + \frac{\hbar}{2}\sigma_x \\ \mathcal{M}_y &= zP_x - xP_z + \frac{\hbar}{2}\sigma_y \\ \mathcal{M}_z &= xP_y - yP_x + \frac{\hbar}{2}\sigma_z \end{aligned} \tag{10.5}$$

This is a generalization of the expressions that we studied in 
detail in the part devoted to Pauli's theory. The relations (10.5) 
transform into those of Pauli's theory if we nullify the vector 
potential (so that $P_x = p_x$, $P_y = p_y$, $P_z = p_z$) and represent the 
operators $\sigma_x$, $\sigma_y$, $\sigma_z$ in the form of the 2 × 2 Pauli matrices. 

Let us now try to generalize the operator 

$$P = \sigma_xP_x + \sigma_yP_y + \sigma_zP_z \tag{10.6}$$

(We studied this operator in Pauli's theory.) We construct the 
time derivative of $P$. If we turn to the Hamiltonian (10.1), we see 
that we can express it in terms of $P$ as 

$$H = c\rho_aP + mc^2\rho_c - e\Phi \tag{10.7}$$


This implies that the only term in $H$ that does not commute 
with $P$ is the one with the scalar potential. Whence 

$$\begin{aligned} \frac{dP}{dt} &= \frac{\partial P}{\partial t} + \frac{ie}{\hbar}\,(P\Phi - \Phi P) \\ &= \frac{e}{c}\Bigl(\sigma_x\frac{\partial A_x}{\partial t} + \sigma_y\frac{\partial A_y}{\partial t} + \sigma_z\frac{\partial A_z}{\partial t}\Bigr) + e\Bigl(\sigma_x\frac{\partial\Phi}{\partial x} + \sigma_y\frac{\partial\Phi}{\partial y} + \sigma_z\frac{\partial\Phi}{\partial z}\Bigr) \end{aligned}$$

or 

$$\frac{dP}{dt} = -e\,(\sigma_x\mathcal{E}_x + \sigma_y\mathcal{E}_y + \sigma_z\mathcal{E}_z) \tag{10.8}$$

Thus the time derivative of $P$ is proportional to the scalar 
product of the electric field $\mathcal{E}$ and the spin vector $\sigma$. When the 
electric field is zero, $P$ is a constant of the motion. Note that in 
Pauli's theory we would have obtained the same equation of 
motion for $P$, but there the Hamiltonian depended on $P$ quadratically 
and not linearly. 

In Pauli's theory we also dealt with the operator 

$$\mathcal{M} = \sigma_xm_x + \sigma_ym_y + \sigma_zm_z + \hbar \tag{10.9}$$

We likewise showed that this operator anticommutes with $P$. For 
this reason in Dirac's theory $\mathcal{M}$ will not be a constant of the 
motion even for a free electron. But since matrix $\rho_c$ anticommutes 
with matrix $\rho_a$, which stands in the first member of the 
Hamiltonian (10.7), and commutes with the other two members of (10.7), 
the operator $\mathcal{M}_D = \rho_c\mathcal{M}$ will, in the absence of a field, commute 
with all the members of (10.7) and thus will be a constant of the 
motion. 

When there is no electromagnetic field, $\mathcal{M}$ can be written as 

$$\mathcal{M} = \sigma_x\mathcal{M}_x + \sigma_y\mathcal{M}_y + \sigma_z\mathcal{M}_z - \frac{\hbar}{2} \tag{10.10}$$

where $\mathcal{M}_x$, $\mathcal{M}_y$, $\mathcal{M}_z$ are determined by (10.5). We will assume 
that $\mathcal{M}$ is determined by (10.10) also when the momentum 
operators $P_x$, $P_y$, $P_z$ contain the vector potential. Let us construct the 
expression for the total time derivative of operator $\mathcal{M}_D = \rho_c\mathcal{M}$ 
for this (general) case. We will have 

$$\begin{aligned} \frac{d\mathcal{M}_D}{dt} = \frac{d}{dt}\,(\rho_c\mathcal{M}) = &-e\rho_c\,[\sigma_x(y\mathcal{E}_z - z\mathcal{E}_y) + \sigma_y(z\mathcal{E}_x - x\mathcal{E}_z) + \sigma_z(x\mathcal{E}_y - y\mathcal{E}_x)] \\ &+ e\rho_b\,[\sigma_x(y\mathcal{H}_z - z\mathcal{H}_y) + \sigma_y(z\mathcal{H}_x - x\mathcal{H}_z) + \sigma_z(x\mathcal{H}_y - y\mathcal{H}_x)] \end{aligned} \tag{10.11}$$


The right-hand side vanishes not only in the absence of a field 
but also when the magnetic field is zero and the electric field is 
directed along the radius (a central field), which is an important 
case for applications. We will consider the problem of an electron 
in a central field in the next chapter. 
11. The kinetic energy of an electron 

If in (9.3) or (10.1) we drop the term with the scalar potential, 
we come to the operator 

$$T = c\,(\alpha_1P_x + \alpha_2P_y + \alpha_3P_z) + mc^2\alpha_4 \tag{11.1}$$

or 

$$T = c\rho_a\,(\sigma_xP_x + \sigma_yP_y + \sigma_zP_z) + mc^2\rho_c \tag{11.2}$$

which can be interpreted as the kinetic-energy operator in 
contrast to the Hamiltonian $H$, which is the total-energy operator. 
This new operator is the quantum counterpart of the classical 
quantity 

$$T = \frac{mc^2}{(1 - v^2/c^2)^{1/2}} = c\,(m^2c^2 + p^2)^{1/2} \tag{11.3}$$

where $v$ is the speed of the electron, and $p$ is its momentum. 
Such an interpretation is justified by (a) that there is an analogy 
between the quantum and classical expressions for the time 
derivative of $T$ and (b) that the eigenvalues of $T$ are greater than 
$mc^2$ in absolute value. This analogy becomes still greater if 
instead of operator $T$ we associate with the classical kinetic energy 
the mean value of the Heisenberg matrix for this operator. We 
will find the mean value of the matrix in the next section. 

Let us first construct the time derivative of operator $T$. By the 
general formula we have 

$$\frac{dT}{dt} = \frac{\partial T}{\partial t} + \frac{i}{\hbar}\,(HT - TH) \tag{11.4}$$

where 

$$H = T - e\Phi \tag{11.5}$$

But $T$ depends on time only through the vector potential, which 
appears in the operators $P_x$, $P_y$, $P_z$. On the other hand, the only 
term in $H$ that does not commute with $T$ is $-e\Phi$. For this reason 

$$\frac{dT}{dt} = \frac{\partial T}{\partial t} + \frac{ie}{\hbar}\,(T\Phi - \Phi T) \tag{11.6}$$

or 

$$\frac{dT}{dt} = -ec\,(\alpha_1\mathcal{E}_x + \alpha_2\mathcal{E}_y + \alpha_3\mathcal{E}_z) \tag{11.6*}$$

But in Section 9 we have seen that according to the Dirac 
equation 

$$\dot x = c\alpha_1, \qquad \dot y = c\alpha_2, \qquad \dot z = c\alpha_3 \tag{11.7}$$

Whence we can write (11.6*) as 

$$\frac{dT}{dt} = -e\,(\dot x\mathcal{E}_x + \dot y\mathcal{E}_y + \dot z\mathcal{E}_z) \tag{11.8}$$

This equation formally coincides with Eq. (2.3). 

To make sure that the eigenvalues of $T$ exceed $mc^2$ in absolute 
value we must find the square of $T$. If we use (11.2) and the 
properties of the matrices $\rho$, we get 

$$T^2 = m^2c^4 + c^2P^2 \tag{11.9}$$

The second term on the right-hand side is $c^2$ times the square of 
the hermitian operator 

$$P = \sigma_xP_x + \sigma_yP_y + \sigma_zP_z \tag{11.10}$$

[we studied this operator in the part devoted to Pauli's theory 
(Part III)]. If we denote the eigenvalues (real quantities) by $P'$, 
the eigenvalues of $T^2$ will be 

$$T'^2 = c^2\,(m^2c^2 + P'^2) \tag{11.11}$$

so that 

$$T' = \pm c\,(m^2c^2 + P'^2)^{1/2} \tag{11.12}$$

and hence 

$$|T'| \geq mc^2 \tag{11.13}$$


Let us show that the theory does give two signs of the 
eigenvalues of $T$. We write the eigenvalue equation 

$$T\psi = c\rho_a\,(\sigma_xP_x + \sigma_yP_y + \sigma_zP_z)\,\psi + mc^2\rho_c\psi = T'\psi \tag{11.14}$$

If $\psi$ is a solution of this equation corresponding to eigenvalue $T'$, 
then 

$$\psi^* = \rho_b\psi \tag{11.15}$$

will be the eigenfunction corresponding to eigenvalue $-T'$. 
Indeed, in virtue of the fact that matrix $\rho_b$ commutes with 
matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ and anticommutes with $\rho_a$ and $\rho_c$ we find that 

$$T\psi^* = T\rho_b\psi = -\rho_bT\psi = -\rho_bT'\psi = -T'\rho_b\psi$$

that is, 

$$T\psi^* = -T'\psi^* \tag{11.16}$$

which proves our statement. 
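
For a free electron in a state of definite momentum the operators $P_x$, $P_y$, $P_z$ may be replaced by numbers, and $T$ becomes an ordinary 4 × 4 matrix. The sketch below uses that simplification (an assumption of this example, together with units $m = c = 1$ and the explicit matrices quoted in Section 1 of Chapter II) to check relation (11.9) and the sign-flipping property of $\rho_b$.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combo(*terms):
    # Linear combination of matrices: combo((c1, M1), (c2, M2), ...)
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
SX = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, -1, 0]]
SY = [[0, -1j, 0, 0], [1j, 0, 0, 0], [0, 0, 0, -1j], [0, 0, 1j, 0]]
SZ = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]
RA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
RB = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
RC = [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]

m, c = 1.0, 1.0               # illustrative units
p = (0.3, -0.4, 1.2)          # illustrative momentum components (numbers)
sigma_p = combo((p[0], SX), (p[1], SY), (p[2], SZ))
T = combo((c, matmul(RA, sigma_p)), (m * c * c, RC))      # Eq. (11.2)

p_squared = sum(x * x for x in p)
square_ok = close(matmul(T, T),
                  combo((m * m * c ** 4 + c * c * p_squared, ONE)))  # Eq. (11.9)
flip_ok = close(matmul(T, RB), combo((-1, matmul(RB, T))))           # T rho_b = -rho_b T
print(square_ok, flip_ok)
```

The second flag is the matrix form of the argument leading to (11.16): if $T\psi = T'\psi$, then $T\rho_b\psi = -\rho_bT\psi = -T'\rho_b\psi$.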


12. The second intrinsic degree of freedom 
of the electron 

The fact that the theory predicts negative values for kinetic 
energy presents a substantial difficulty. This difficulty is connected 
with the above-mentioned paradox that the eigenvalues of the 
velocity operators for an electron are $\pm c$. Both corollaries are due 
to a second degree of freedom of the electron, and this is 
described by the operators $\rho_a$, $\rho_b$, $\rho_c$ (we consider the first degree of 
freedom to be described by the spin vector with components 
$\sigma_x$, $\sigma_y$, $\sigma_z$). This second degree of freedom is of a relativistic 
nature. Apparently, its physical meaning is that the Dirac equation, 
in a sense, describes not only electrons but positrons as well (a 
positron is an elementary particle with electron mass and positive 
charge equal to that of the electron). 

If we are to interpret the Dirac equation in such a way, it is 
impossible to preserve the usual interpretation of the wave 
function as describing a one-particle state. The usual interpretation 
can still be applied, in a sense, to quantities that are not 
connected with transitions from states with positive energies to states 
with negative energies (or with the inverse transitions). (This 
concerns first of all the corresponding elements of Heisenberg's 
matrices.) If we want to isolate such quantities, we must build for 
each operator the corresponding Heisenberg matrix and then 
discard the matrix elements that correspond to transitions 
between states with energies of opposite signs (that is, between 
energies of the order of $+mc^2$ and $-mc^2$). Since these elements 
contain rapidly oscillating factors $e^{\pm i\omega t}$, where $\omega$ is of the order 
of $2mc^2/\hbar$, discarding these matrix elements is equivalent to 
averaging over a time interval that is much greater than $1/\omega$ but 
smaller than the reciprocal of the frequencies of ordinary 
transitions. Let us illustrate this by the following example. 

We construct Heisenberg's matrices of the operators $\rho_a$, $\rho_b$, $\rho_c$, 
which enter into the Hamiltonian 

$$H = c\rho_a\,(\sigma_xP_x + \sigma_yP_y + \sigma_zP_z) + mc^2\rho_c - e\Phi \tag{12.1}$$

According to (4.8) these operators satisfy the following 
relationships: 

$$\begin{aligned} &\rho_a^2 = 1, \qquad \rho_b^2 = 1, \qquad \rho_c^2 = 1 \\ &\rho_b\rho_c = i\rho_a, \qquad \rho_c\rho_a = i\rho_b, \qquad \rho_a\rho_b = i\rho_c \end{aligned} \tag{12.2}$$

If we then use the notation 

$$P = \sigma_xP_x + \sigma_yP_y + \sigma_zP_z \tag{12.3}$$

we can write the Hamiltonian as 

$$H = c\rho_aP + mc^2\rho_c - e\Phi \tag{12.4}$$

By using formula (10.8), that is, 

$$\frac{dP}{dt} = -e\,(\sigma_x\mathcal{E}_x + \sigma_y\mathcal{E}_y + \sigma_z\mathcal{E}_z) \tag{12.5}$$

we can maintain that, in the absence of an electric field, $P$ is a 
constant of the motion. 

In order to build Heisenberg's matrices for the operators $\rho_a$, 
$\rho_b$, $\rho_c$ we construct the equations of motion for them. If for the 
sake of brevity we put 

$$\omega = \frac{2mc^2}{\hbar} \tag{12.6}$$

we get 

$$\frac{d\rho_a}{dt} = -\omega\rho_b, \qquad \frac{d\rho_b}{dt} = \omega\rho_a - \omega\frac{P}{mc}\,\rho_c, \qquad \frac{d\rho_c}{dt} = \omega\frac{P}{mc}\,\rho_b \tag{12.7}$$

The rest of this section is devoted to the case without an 
electric field. Since here $P$ is a constant of the motion, in Eqs. (12.7) 
we can consider $P$ to be this constant and not the operator (12.3). 
Then the equations will have constant coefficients and can be 
solved in a straightforward manner. 

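Instead of $\rho_a$, $\rho_b$, $\rho_c$ we introduce three matrices 

$$\begin{aligned} \tau_a &= \frac{1}{[1 + P^2/(m^2c^2)]^{1/2}}\Bigl(\rho_a - \frac{P}{mc}\,\rho_c\Bigr) \\ \tau_b &= \rho_b \\ \tau_c &= \frac{1}{[1 + P^2/(m^2c^2)]^{1/2}}\Bigl(\frac{P}{mc}\,\rho_a + \rho_c\Bigr) \end{aligned} \tag{12.8}$$

These matrices satisfy the same relationships (12.2) as $\rho_a$, $\rho_b$, and 
$\rho_c$, namely 

$$\begin{aligned} &\tau_a^2 = 1, \qquad \tau_b^2 = 1, \qquad \tau_c^2 = 1 \\ &\tau_b\tau_c = i\tau_a, \qquad \tau_c\tau_a = i\tau_b, \qquad \tau_a\tau_b = i\tau_c \end{aligned} \tag{12.9}$$

We can express the $\rho$'s in terms of the $\tau$'s in the following way: 

$$\begin{aligned} \rho_a &= \frac{1}{[1 + P^2/(m^2c^2)]^{1/2}}\Bigl(\tau_a + \frac{P}{mc}\,\tau_c\Bigr) \\ \rho_b &= \tau_b \\ \rho_c &= \frac{1}{[1 + P^2/(m^2c^2)]^{1/2}}\Bigl(-\frac{P}{mc}\,\tau_a + \tau_c\Bigr) \end{aligned} \tag{12.10}$$

where the constant factors differ from the ones in (12.8) only in 
the sign of $P$. 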
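
The change of variables from the $\rho$'s to the $\tau$'s is a rotation in the plane of $\rho_a$ and $\rho_c$, which is why the multiplication rules are preserved. A minimal numerical sketch of this, assuming the explicit $\rho$ matrices quoted in Section 1 of Chapter II and treating $P/(mc)$ as a plain number `q` (an illustrative value), is:

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combo(*terms):
    # Linear combination of matrices: combo((c1, M1), (c2, M2), ...)
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
RA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
RB = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
RC = [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]

q = 0.8                        # stands for P/(mc); illustrative value
norm = (1 + q * q) ** -0.5     # the common factor in (12.8) and (12.10)

TA = combo((norm, RA), (-norm * q, RC))
TB = RB
TC = combo((norm * q, RA), (norm, RC))

squares_ok = all(close(matmul(t, t), ONE) for t in (TA, TB, TC))
products_ok = (close(matmul(TB, TC), combo((1j, TA)))
               and close(matmul(TC, TA), combo((1j, TB)))
               and close(matmul(TA, TB), combo((1j, TC))))
inverse_ok = (close(RA, combo((norm, TA), (norm * q, TC)))
              and close(RC, combo((-norm * q, TA), (norm, TC))))
print(squares_ok, products_ok, inverse_ok)
```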

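The equations of motion for the $\tau$'s are 

$$\frac{d\tau_a}{dt} = -\nu\tau_b, \qquad \frac{d\tau_b}{dt} = \nu\tau_a, \qquad \frac{d\tau_c}{dt} = 0 \tag{12.11}$$

where for the sake of brevity we have put 

$$\nu = \omega\Bigl(1 + \frac{P^2}{m^2c^2}\Bigr)^{1/2} = \frac{2c}{\hbar}\,(m^2c^2 + P^2)^{1/2} \tag{12.12}$$

We can now easily solve Eqs. (12.11) if we take into account 
conditions (12.9). We have 

$$\begin{aligned} \tau_a &= -\tau_1\sin\nu t - \tau_2\cos\nu t \\ \tau_b &= \tau_1\cos\nu t - \tau_2\sin\nu t \\ \tau_c &= \tau_3 \end{aligned} \tag{12.13}$$

where $\tau_1$, $\tau_2$, $\tau_3$ are constant matrices that satisfy the conditions 

$$\begin{aligned} &\tau_1^2 = 1, \qquad \tau_2^2 = 1, \qquad \tau_3^2 = 1 \\ &\tau_2\tau_3 = i\tau_1, \qquad \tau_3\tau_1 = i\tau_2, \qquad \tau_1\tau_2 = i\tau_3 \end{aligned} \tag{12.14}$$

similar to (12.9). 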

Let us apply the averaging process to the Heisenberg matrices 
just obtained. (We mentioned the need for this process at the 
beginning of this section.) We have 

$$\langle\tau_a\rangle = 0, \qquad \langle\tau_b\rangle = 0, \qquad \langle\tau_c\rangle = \tau_3 \tag{12.15}$$

Thus the mean values of Heisenberg's matrices for the operators 
$\rho$ are 

$$\langle\rho_a\rangle = \frac{P}{(m^2c^2 + P^2)^{1/2}}\,\tau_3, \qquad \langle\rho_b\rangle = 0, \qquad \langle\rho_c\rangle = \frac{mc}{(m^2c^2 + P^2)^{1/2}}\,\tau_3 \tag{12.16}$$

Substituting these values into the expression for the Hamiltonian 
(12.4), we get 

$$H = c\tau_3\,(m^2c^2 + P^2)^{1/2} - e\Phi \tag{12.17}$$

If we remember that $\tau_3$ is a constant whose square is unity, the 
last formula is the same as that of the classical kinetic energy, 
(11.3). 
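
For the field-free Hamiltonian the combination $c\rho_aP + mc^2\rho_c$ is in fact exactly proportional to $\tau_c$, so the oscillating parts drop out of $H$ identically. The sketch below illustrates this under simplifying assumptions: $P$ is replaced by a number, units with $m = c = 1$ are used, and the explicit $\rho_a$, $\rho_c$ are those quoted in Section 1 of Chapter II.

```python
import math

def combo(*terms):
    # Linear combination of matrices: combo((c1, M1), (c2, M2), ...)
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

RA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
RC = [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]

m, c, P = 1.0, 1.0, 0.8        # illustrative values; P is a number here
q = P / (m * c)
norm = (1 + q * q) ** -0.5

TC = combo((norm * q, RA), (norm, RC))                 # tau_c of Eq. (12.8)
H_free = combo((c * P, RA), (m * c * c, RC))           # Eq. (12.4) with Phi = 0
H_tau = combo((c * math.sqrt(m * m * c * c + P * P), TC))

matches = close(H_free, H_tau)
print(matches)
```

Since $\tau_c = \tau_3$ is constant in time, this agrees with (12.17) for $\Phi = 0$.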


13. Second-order equations 

The Dirac equation is a system of four differential equations of 
the first order in four unknown functions. We can exclude two 
functions and construct a system of two second-order equations 
in two unknown functions. If we then pass to the limit 
by letting $c$ tend to infinity, we come to a nonrelativistic wave 
equation for an electron in a magnetic field. This brings us to Pauli's 
theory, which we considered in Part III. 

Since the derivation of the Pauli equation from the Dirac equa¬ 
tion is interesting in itself, we will do it here although we already 
know the result. 

In order to find a system of two second-order equations using 
the Dirac equation we write the latter as 

$$T\psi = e\Phi\psi + i\hbar\frac{\partial\psi}{\partial t} \tag{13.1}$$

where $T$ is the electron's kinetic-energy operator considered in 
Section 11. Let us apply operator $T$ to both sides of this equation. 
After some manipulations we get 

$$T^2\psi = \Bigl[-i\hbar\frac{\partial T}{\partial t} + e\,(T\Phi - \Phi T)\Bigr]\psi + i\hbar e\frac{\partial\Phi}{\partial t}\,\psi + e^2\Phi^2\psi + 2i\hbar e\Phi\frac{\partial\psi}{\partial t} - \hbar^2\frac{\partial^2\psi}{\partial t^2} \tag{13.2}$$

The expression in brackets on the right-hand side is, as can be 
seen, $-i\hbar$ times the total time derivative of operator $T$, which we 
have already calculated [see (11.5)]. We also calculated the 
expression for $T^2$, but we must modify it. If we determine $P^2$ in the 
second term of (11.9), we get on the basis of (9.11) and the 
properties of matrices $\sigma$ the following: 

$$T^2 = m^2c^4 + c^2\,(P_x^2 + P_y^2 + P_z^2) + \hbar ec\,(\sigma_x\mathcal{H}_x + \sigma_y\mathcal{H}_y + \sigma_z\mathcal{H}_z) \tag{13.3}$$

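We use the expressions of type 

$$P_x^2 = \Bigl(p_x + \frac{e}{c}A_x\Bigr)^2 = p_x^2 + \frac{2e}{c}A_xp_x - \frac{i\hbar e}{c}\frac{\partial A_x}{\partial x} + \frac{e^2}{c^2}A_x^2$$

and also for brevity the vector notations, and we get 

$$T^2 = m^2c^4 + c^2p^2 + 2ec\,(\mathbf{A}\cdot\mathbf{p}) - i\hbar ec\operatorname{div}\mathbf{A} + e^2\mathbf{A}^2 + \hbar ec\,(\boldsymbol\sigma\cdot\boldsymbol{\mathcal{H}}) \tag{13.4}$$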


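We insert this expression into (13.2) and use (11.8) and the 
Lorentz condition 

$$\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\Phi}{\partial t} = 0 \tag{13.5}$$

This yields 

$$[m^2c^4 + c^2p^2 + 2ec\,(\mathbf{A}\cdot\mathbf{p}) + e^2(\mathbf{A}^2 - \Phi^2) + \hbar ec\,(\boldsymbol\sigma\cdot\boldsymbol{\mathcal{H}}) - i\hbar e\,(\dot{\mathbf{x}}\cdot\boldsymbol{\mathcal{E}})]\,\psi = 2i\hbar e\Phi\frac{\partial\psi}{\partial t} - \hbar^2\frac{\partial^2\psi}{\partial t^2} \tag{13.6}$$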


This equation can also be written as 

$$-\Delta\psi + \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} - \frac{2ie}{\hbar c}\Bigl[(\mathbf{A}\cdot\operatorname{grad}\psi) + \frac{\Phi}{c}\frac{\partial\psi}{\partial t}\Bigr] + \frac{e^2}{\hbar^2c^2}(\mathbf{A}^2 - \Phi^2)\,\psi + \frac{m^2c^2}{\hbar^2}\,\psi + \frac{e}{\hbar c}(\boldsymbol\sigma\cdot\boldsymbol{\mathcal{H}})\,\psi - \frac{ie}{\hbar c^2}(\dot{\mathbf{x}}\cdot\boldsymbol{\mathcal{E}})\,\psi = 0 \tag{13.7}$$

This is a system of four equations for the four functions $\psi_1$, $\psi_2$, 
$\psi_3$, $\psi_4$. With our choice of matrices $\alpha_k$ the first two equations 
include only the first two functions $\psi_1$ and $\psi_2$, and the second two 
equations only $\psi_3$ and $\psi_4$, so that Eq. (13.7) separates into two 
systems. 

Equation (13.7) differs from the relativistic generalization of 
the Schrödinger equation proposed by various authors before the 
concept of the electron spin was introduced and before Dirac's 
theory. The difference lies in the two last terms, which contain the 
matrices $\boldsymbol\sigma$ and $\dot{\mathbf{x}}$. 

Let us see what equation we get from (13.6) or (13.7) if we 
omit the relativistic correction, that is, if we effect the limiting 
process $c \to \infty$ on the assumption that the energy of the particle 
is close to its rest energy $mc^2$. 

For this we put 

$$\psi = \psi^* e^{-imc^2t/\hbar} \tag{13.8}$$

and assume that $\psi^*$ varies slowly in time in comparison with $\psi$. 
For stationary states this assumption means that in 

$$\psi = \psi^0 e^{-iWt/\hbar} \tag{13.9}$$

we have put 

$$W = mc^2 + E \tag{13.10}$$

and consider $E$ to be very small in comparison with $mc^2$. If we 
insert (13.8) into Eq. (13.6) and divide the result by $2mc^2$, we get 
the exact equation 

$$\Bigl[\frac{1}{2m}\Bigl(\mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr)^2 - e\Phi + \frac{e\hbar}{2mc}(\boldsymbol\sigma\cdot\boldsymbol{\mathcal{H}}) - i\hbar\frac{\partial}{\partial t}\Bigr]\psi^* = \frac{1}{2mc^2}\Bigl[i\hbar e\,(\dot{\mathbf{x}}\cdot\boldsymbol{\mathcal{E}}) + \Bigl(e\Phi + i\hbar\frac{\partial}{\partial t}\Bigr)^2\Bigr]\psi^* \tag{13.11}$$

We must remember that when passing to the limit $c \to \infty$ the 
factor $1/c$ in the terms containing the magnetic quantities $\mathbf{A}$ 
and $\boldsymbol{\mathcal{H}}$ appears from the use of the Gaussian units, that is, the 
factor is a constant. Because of this we must retain all the terms 
on the left-hand side of (13.11), whereas we replace the 
right-hand side with zero. The approximate equation is then 

$$\Bigl[\frac{1}{2m}\Bigl(\mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr)^2 - e\Phi + \frac{e\hbar}{2mc}(\boldsymbol\sigma\cdot\boldsymbol{\mathcal{H}})\Bigr]\psi^* = i\hbar\frac{\partial\psi^*}{\partial t} \tag{13.12}$$

The operator on the left-hand side of (13.12) 

$$H^* = \frac{1}{2m}\Bigl(\mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr)^2 - e\Phi + \frac{e\hbar}{2mc}(\boldsymbol\sigma\cdot\boldsymbol{\mathcal{H}}) \tag{13.13}$$

is hermitian. If we take $\boldsymbol\sigma$ to be the set of the 2 × 2 Pauli 
matrices, the Hamiltonian $H^*$ will coincide with the Pauli operator 
examined in Section 5, Part III, and the wave equation (13.12) 
will coincide with the Pauli equation. Introduction of 4 × 4 
matrices makes no essential change because the equation for four 
component functions, (13.12), splits into two equivalent systems 
of equations for two component functions. 

The fact that the Pauli equation is obtained from the Dirac 
equation as an approximation is additional proof that it is valid. 

In conclusion we write the wave equation (13.12) in explicit 
form. If by $H^0$ in accordance with (5.18), Part III, we denote 

$$H^0 = \frac{1}{2m}\Bigl(\mathbf{p} + \frac{e}{c}\mathbf{A}\Bigr)^2 - e\Phi \tag{13.14}$$

which does not contain matrices, we get 

$$H^* = H^0 + \mu^0\,(\boldsymbol\sigma\cdot\boldsymbol{\mathcal{H}}) \tag{13.15}$$

where 

$$\mu^0 = \frac{e\hbar}{2mc} \tag{13.16}$$

is the Bohr magneton. 
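
For orientation, the size of this quantity in Gaussian units can be computed directly from $\mu^0 = e\hbar/(2mc)$. The constants below are standard CGS values inserted for this illustration; they do not appear in the text.

```python
# Physical constants in Gaussian (CGS) units; standard reference values.
e = 4.80320471e-10      # elementary charge, statcoulomb
hbar = 1.054571817e-27  # reduced Planck constant, erg * s
m = 9.1093837e-28       # electron mass, g
c = 2.99792458e10       # speed of light, cm / s

bohr_magneton = e * hbar / (2 * m * c)   # erg per gauss
print(f"{bohr_magneton:.4e} erg/G")
```

The result is close to $9.27 \times 10^{-21}$ erg/G, which sets the scale of the spin term $\mu^0(\boldsymbol\sigma\cdot\boldsymbol{\mathcal{H}})$ relative to the other terms of $H^*$.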



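If we assume that $\boldsymbol\sigma$ are the 2 × 2 Pauli matrices introduced 
in Section 1, Part III, 

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{13.17}$$

and $\psi^*$ is the two component wave function (the first two 
components of the four component Dirac function), we can with our 
choice of matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ [see (4.26)] write Eq. (13.12) as 

$$\begin{aligned} H^0\psi_1^* + \mu^0\,[\mathcal{H}_z\psi_1^* + (\mathcal{H}_x - i\mathcal{H}_y)\,\psi_2^*] - i\hbar\frac{\partial\psi_1^*}{\partial t} &= 0 \\ H^0\psi_2^* + \mu^0\,[(\mathcal{H}_x + i\mathcal{H}_y)\,\psi_1^* - \mathcal{H}_z\psi_2^*] - i\hbar\frac{\partial\psi_2^*}{\partial t} &= 0 \end{aligned} \tag{13.18}$$

The equations for the last two components ($\psi_3^*$ and $\psi_4^*$) will be 
similar. They will differ from (13.18) in the signs of the terms 
that derive from $\sigma_1$ and $\sigma_3$ (in other words, the terms proportional 
to $\mathcal{H}_x$ and $\mathcal{H}_z$). If we choose the matrices according to Dirac [see 
(4.18)], the equations for the last two components will be a 
simple repetition of the equations for the first two. 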




Chapter II 


THE USE OF THE DIRAC EQUATION 
IN PHYSICAL PROBLEMS 


1. The free electron 

The wave equation for a free electron has the form 

$$H\psi - i\hbar\frac{\partial\psi}{\partial t} = 0 \tag{1.1}$$

where according to (3.7) and (4.13), Chapter I, 

$$H = c\,(\alpha_1p_x + \alpha_2p_y + \alpha_3p_z) + mc^2\alpha_4 \tag{1.2}$$

or 

$$H = c\rho_a\,(\sigma_xp_x + \sigma_yp_y + \sigma_zp_z) + mc^2\rho_c \tag{1.3}$$

Since the law of conservation of energy holds for a free electron, 
we can add to the wave equation the eigenvalue equation for the 
Hamiltonian 

$$H\psi = W\psi \tag{1.4}$$

Further, the operators $p_x$, $p_y$, $p_z$ commute with $H$ and therefore 
are constants of the motion. Since they also commute with each 
other, we can consider the momentum components to be given 
numbers $p_x'$, $p_y'$, $p_z'$ and subject $\psi$ to the additional conditions 

$$\begin{aligned} p_x\psi &= -i\hbar\frac{\partial\psi}{\partial x} = p_x'\psi \\ p_y\psi &= -i\hbar\frac{\partial\psi}{\partial y} = p_y'\psi \\ p_z\psi &= -i\hbar\frac{\partial\psi}{\partial z} = p_z'\psi \end{aligned} \tag{1.5}$$

Mathematically this means we have assumed the dependence of 
all four functions $\psi_i$ on the coordinates and time to be 

$$\psi = \psi^0\exp\Bigl[\frac{i}{\hbar}\,(xp_x' + yp_y' + zp_z' - Wt)\Bigr] \tag{1.6}$$

that is, we are considering a plane wave. 

Another constant of the motion is the operator 

$$P = \sigma_xp_x + \sigma_yp_y + \sigma_zp_z \tag{1.7}$$

which commutes with both $H$ and $p_x$, $p_y$, $p_z$. Hence we can 
subject $\psi$ to yet another condition 

$$P\psi = P'\psi \tag{1.8}$$

We met operator $P$ in Pauli's theory, but there we had no occasion 
to calculate its eigenfunctions since the Pauli equation contains 
only its square, which in the absence of a field is 

$$P^2 = p_x^2 + p_y^2 + p_z^2 \tag{1.9}$$

For this reason, when $p_x'$, $p_y'$, $p_z'$ are given, eigenvalue $P'$ can 
only have two values: 

$$P' = +(p_x'^2 + p_y'^2 + p_z'^2)^{1/2} \quad\text{and}\quad P' = -(p_x'^2 + p_y'^2 + p_z'^2)^{1/2} \tag{1.10}$$

The Hamiltonian expressed in terms of $P$ will have the form 

$$H = c\rho_aP + mc^2\rho_c \tag{1.11}$$

Since 

$$H^2 = m^2c^4 + c^2P^2 \tag{1.12}$$

the eigenvalues $W$ of $H$ will be 

$$W = +(m^2c^4 + c^2P^2)^{1/2} \quad\text{and}\quad W = -(m^2c^4 + c^2P^2)^{1/2} \tag{1.13}$$

Thus for a given value of momentum we have four solutions: 

$$\begin{aligned} \text{first} \quad & W = +|W|, \quad P = +|P| \\ \text{second} \quad & W = +|W|, \quad P = -|P| \\ \text{third} \quad & W = -|W|, \quad P = +|P| \\ \text{fourth} \quad & W = -|W|, \quad P = -|P| \end{aligned} \tag{1.14}$$

The first two correspond to positive kinetic energy. Of these 
the first corresponds to the magnetic moment or the spin vector 
coinciding in direction with the direction of motion; the second 
corresponds to the spin vector pointing in the opposite direction. The second two 
correspond to negative energy and have no physical meaning in 
ordinary quantum mechanics, which deals only with a fixed number 
of charged particles. In the general case we established the 
existence of these solutions in Section 11, Chapter I. 

Now let us find the eigenfunctions that describe these four 
states. 

We have a set of simultaneous algebraic equations (1.4) and 
(1.8), which we write (ignoring the primes) as 

$$(\sigma_xp_x + \sigma_yp_y + \sigma_zp_z)\,\psi = P\psi \tag{1.15}$$

$$(c\rho_aP + mc^2\rho_c)\,\psi = W\psi \tag{1.16}$$

These serve to determine the four component function $\psi$. But 
Eq. (1.15) also holds for the two component function of Pauli's 
theory. 

If according to formulas in Section 1, Chapter III, we consider 
Ox, o y , o z to be the Pauli matrices 

O x — Oi, 0y — 02, <J * ==0 3 (1.17) 
we can write (1.15) as

$$ \begin{aligned} (p_x - ip_y)\psi_2 + p_z\psi_1 &= P\psi_1 \\ (p_x + ip_y)\psi_1 - p_z\psi_2 &= P\psi_2 \end{aligned} \tag{1.18} $$

Owing to the relationship (1.9) the system determinant vanishes.
We can put

$$ \psi_1 = \lambda(P + p_z), \quad \psi_2 = \lambda(p_x + ip_y) \tag{1.19} $$

where $\lambda$ is constant.

If, on the other hand, we view (1.15) as the equation for a four
component function and, in accordance with formulas (4.26),
Chapter I, take for $\sigma_x, \sigma_y, \sigma_z$ the matrices

$$ \sigma_x = s_1, \quad \sigma_y = s_2, \quad \sigma_z = s_3 \tag{1.20} $$

where

$$ s_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix}, \quad s_2 = \begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix}, \quad s_3 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{1.21} $$

then the equations (1.18) for the functions $\psi_1$ and $\psi_2$ will keep
their form but will be joined by two similar equations for functions
$\psi_3$ and $\psi_4$, namely

$$ \begin{aligned} -(p_x + ip_y)\psi_4 - p_z\psi_3 &= P\psi_3 \\ -(p_x - ip_y)\psi_3 + p_z\psi_4 &= P\psi_4 \end{aligned} \tag{1.22} $$

We can write the solution to these equations as

$$ \psi_3 = \mu(p_x + ip_y), \quad \psi_4 = -\mu(P + p_z) \tag{1.23} $$

Hence the solution of (1.15) for the four component function $\psi$
will contain two arbitrary constants, $\lambda$ and $\mu$. Their ratio can be
determined from (1.16).

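With our choice of matrices we have, according to (4.27),
Chapter I, the following:

$$ \rho_a = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \quad \rho_b = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix}, \quad \rho_c = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \tag{1.24} $$

and Eq. (1.16) in expanded form will be

$$ \begin{aligned} cP\psi_1 - mc^2\psi_4 &= W\psi_1 \\ cP\psi_2 + mc^2\psi_3 &= W\psi_2 \\ -cP\psi_3 + mc^2\psi_2 &= W\psi_3 \\ -cP\psi_4 - mc^2\psi_1 &= W\psi_4 \end{aligned} \tag{1.25} $$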
Expressing by (1.19) and (1.23) the components of the wave
function in terms of $\lambda$ and $\mu$, we get two equations

$$ \begin{aligned} (W - cP)\lambda - mc^2\mu &= 0 \\ -mc^2\lambda + (W + cP)\mu &= 0 \end{aligned} \tag{1.26} $$

each of which appears two times in (1.25). These equations yield

$$ \frac{\lambda}{\mu} = \frac{W + cP}{mc^2} = \frac{mc^2}{W - cP} \tag{1.27} $$

It follows that the ratio $\lambda/\mu$ is real and its sign coincides with
the sign of $W$. Another corollary of (1.27) is

$$ \frac{\lambda^2 + \mu^2}{2\lambda\mu} = \frac{W}{mc^2}, \qquad \frac{\lambda^2 - \mu^2}{2\lambda\mu} = \frac{P}{mc} \tag{1.28} $$
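The free-electron solution can be checked numerically. The following Python sketch is not part of the original text: it builds the four component function from (1.19), (1.23) and (1.27), using the matrices (1.21) and (1.24), and measures how well it satisfies (1.15) and (1.16). The function names and the default units $m = c = 1$ are assumptions made only for this illustration.

```python
import math

# 4x4 matrices (1.21) and (1.24) as nested lists, rows as in the text.
S1 = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, -1, 0]]
S2 = [[0, -1j, 0, 0], [1j, 0, 0, 0], [0, 0, 0, -1j], [0, 0, 1j, 0]]
S3 = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]
RHO_A = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
RHO_C = [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]


def matvec(a, v):
    """Multiply a 4x4 matrix by a 4-component column."""
    return [sum(a[i][j] * v[j] for j in range(4)) for i in range(4)]


def free_spinor(px, py, pz, sign_P=1, sign_W=1, m=1.0, c=1.0):
    """Return (P, W, psi) built from (1.10), (1.13), (1.19), (1.23), (1.27)."""
    P = sign_P * math.sqrt(px * px + py * py + pz * pz)
    W = sign_W * math.sqrt(m * m * c ** 4 + c * c * P * P)
    lam, mu = W + c * P, m * c * c      # ratio lam/mu chosen as in (1.27)
    psi = [lam * (P + pz), lam * (px + 1j * py),
           mu * (px + 1j * py), -mu * (P + pz)]
    return P, W, psi


def residuals(px, py, pz, sign_P=1, sign_W=1, m=1.0, c=1.0):
    """Largest deviations from (1.15) and (1.16) for the spinor above."""
    P, W, psi = free_spinor(px, py, pz, sign_P, sign_W, m, c)
    # (sigma . p) psi with sigma_k = s_k
    sp = [px * a + py * b + pz * d
          for a, b, d in zip(matvec(S1, psi), matvec(S2, psi), matvec(S3, psi))]
    r15 = max(abs(x - P * y) for x, y in zip(sp, psi))
    # H psi = c rho_a (sigma . p) psi + m c^2 rho_c psi
    h = [c * x + m * c * c * y
         for x, y in zip(matvec(RHO_A, sp), matvec(RHO_C, psi))]
    r16 = max(abs(x - W * y) for x, y in zip(h, psi))
    return r15, r16


# All four sign combinations of (1.14) for one sample momentum.
for sP in (1, -1):
    for sW in (1, -1):
        print(sP, sW, residuals(0.3, -0.4, 1.2, sP, sW))
```

Both residuals should be at the level of rounding error for each of the four solutions (1.14). The construction degenerates when $p_x = p_y = 0$ and $P + p_z = 0$, where (1.19) and (1.23) give a vanishing function; that case is not handled in this sketch.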

Substituting the found values (1.19) and (1.23) of the components
of the wave function into the expressions for the current
density given in Section 8, Chapter I, and using the relationship
(1.28), we get

$$ A_4 = 4\lambda\mu P(P + p_z), \quad A_5 = 0 \tag{1.29} $$

and for the space-time components of current density

$$ A_0 = \frac{W}{mc^2}A_4, \quad A_1 = \frac{p_x}{mc}A_4, \quad A_2 = \frac{p_y}{mc}A_4, \quad A_3 = \frac{p_z}{mc}A_4 \tag{1.30} $$

At $W > 0$ we can normalize function $\psi$ in a way such that
$A_4 = mc^2$, and at $W < 0$ so that $A_4 = -mc^2$. Then at $W > 0$

$$ A_0 = W, \quad A_1 = cp_x, \quad A_2 = cp_y, \quad A_3 = cp_z \tag{1.31} $$

and at $W < 0$

$$ A_0 = -W, \quad A_1 = -cp_x, \quad A_2 = -cp_y, \quad A_3 = -cp_z \tag{1.32} $$

Hence the spatial components of the current density are proportional
to the momentum, and the ratios between them and the
time component correspond to the ratio of the speed of the particle
to the speed of light.

In conclusion we note that in the nonrelativistic limit, when $W$
is close to $+mc^2$, the values of the constants $\lambda$ and $\mu$ are close to
each other and in consequence we arrive at the approximate
equalities

$$ \psi_1 \approx -\psi_4, \quad \psi_2 \approx \psi_3 \quad \text{at } W > 0 \tag{1.33} $$

But if $|p| \ll mc$ and $W$ is negative,

$$ \psi_1 \approx \psi_4, \quad \psi_2 \approx -\psi_3 \quad \text{at } W < 0 \tag{1.34} $$
2. An electron in a homogeneous magnetic field

We examined the influence of a homogeneous magnetic field on
the energy levels of an electron in the nonrelativistic approximation
at the end of Chapter III, devoted to Pauli's theory. Here we
will consider a simple problem, assuming that no forces other
than a homogeneous magnetic field act on the electron, but we
will solve the problem on the basis of Dirac's theory with no further
omissions.

We assume that the field is directed along the $z$ axis and in
absolute value is equal to $|\mathcal{H}|$. We can put the vector potential
equal to

$$ A_x = -\tfrac{1}{2}|\mathcal{H}|\,y, \quad A_y = \tfrac{1}{2}|\mathcal{H}|\,x, \quad A_z = 0 \tag{2.1} $$

which in cylindrical (polar) coordinates

$$ x = \rho\cos\varphi, \quad y = \rho\sin\varphi \tag{2.2} $$

corresponds to

$$ A_\varphi = \tfrac{1}{2}|\mathcal{H}|\,\rho^2, \quad A_\rho = 0, \quad A_z = 0 \tag{2.1*} $$

We will formulate our problem in Cartesian coordinates and only
at the end of the calculations shift to polar (non-Cartesian)
coordinates.

The Hamiltonian will have the form

$$ H = c\rho_a(\sigma_x P_x + \sigma_y P_y + \sigma_z P_z) + mc^2\rho_c \tag{2.3} $$

where

$$ P_x = p_x - \frac{e}{2c}|\mathcal{H}|\,y, \quad P_y = p_y + \frac{e}{2c}|\mathcal{H}|\,x, \quad P_z = p_z \tag{2.4} $$

Here, as with the free electron, the operator

$$ P = \sigma_x P_x + \sigma_y P_y + \sigma_z P_z \tag{2.5} $$

which enters into the expression for $H$ will be a constant of the
motion. Another constant of the motion will be operator $p_z$, which
commutes with $P$.

We can therefore consider the complete system of equations

$$ H\psi = W\psi \tag{2.6} $$

$$ P\psi = P'\psi \tag{2.7} $$

$$ p_z\psi = p'_z\psi \tag{2.8} $$

As in the case of the free electron every eigenvalue of $P$ will
have corresponding to it two values of $W$:

$$ W = \pm\left(m^2c^4 + c^2P'^2\right)^{1/2} \tag{2.9} $$
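The problem reduces to finding the eigenfunctions of $P$, that is,
to solving Eq. (2.7). With our choice of matrices Eq. (2.7) can be
written as

$$ (\sigma_1 P_x + \sigma_2 P_y + \sigma_3 P_z)\psi = P'\psi \tag{2.10} $$

where $\sigma_k$ are the $2\times2$ Pauli matrices. Once we find $\psi_1$ and $\psi_2$
from these equations, we can obtain $\psi_3$ and $\psi_4$ from (1.16).

If we apply $P$ to (2.10) and make use of the commutation relations
between $P_x, P_y, P_z$ [see Section 5, Part III, and Section 9,
Chapter I, Part VI], we get

$$ \left(P_x^2 + P_y^2 + P_z^2 + \frac{e\hbar}{c}\,\sigma_3|\mathcal{H}|\right)\psi = P'^2\psi \tag{2.11} $$

If we divide this equation by $2m$, we get on the left-hand side an
operator that enters into Pauli's operator [see (5.19), Part III]
and coincides with it when the electric field is zero.

Now Eq. (2.11) has only one matrix, $\sigma_3$, which moreover is
diagonal. For this reason (2.11) splits into two equations, each
of which has only one function $\psi_i$. Bearing in mind (2.3) and (2.8),
we can write them as

$$ \left[p_x^2 + p_y^2 + \frac{e|\mathcal{H}|}{c}\,(xp_y - yp_x + \hbar) + \frac{e^2\mathcal{H}^2}{4c^2}\,(x^2 + y^2)\right]\psi_1 = (P'^2 - p_z'^2)\,\psi_1 \tag{2.12} $$

$$ \left[p_x^2 + p_y^2 + \frac{e|\mathcal{H}|}{c}\,(xp_y - yp_x - \hbar) + \frac{e^2\mathcal{H}^2}{4c^2}\,(x^2 + y^2)\right]\psi_2 = (P'^2 - p_z'^2)\,\psi_2 \tag{2.12*} $$

These equations differ only in the sign in front of the term
containing $\hbar$. Let us express the operators $p_x$ and $p_y$ in terms of
derivatives and put for brevity

$$ \frac{e|\mathcal{H}|}{2\hbar c} = b, \qquad \frac{1}{4b\hbar^2}\,(P'^2 - p_z'^2) = l \tag{2.13} $$

We get

$$ -\frac{\partial^2\psi_1}{\partial x^2} - \frac{\partial^2\psi_1}{\partial y^2} - 2ib\left(x\frac{\partial\psi_1}{\partial y} - y\frac{\partial\psi_1}{\partial x}\right) + b^2(x^2 + y^2)\,\psi_1 = 2b\,(2l - 1)\,\psi_1 \tag{2.14} $$

$$ -\frac{\partial^2\psi_2}{\partial x^2} - \frac{\partial^2\psi_2}{\partial y^2} - 2ib\left(x\frac{\partial\psi_2}{\partial y} - y\frac{\partial\psi_2}{\partial x}\right) + b^2(x^2 + y^2)\,\psi_2 = 2b\,(2l + 1)\,\psi_2 \tag{2.14*} $$

In polar coordinates, (2.2), these equations take the form

$$ \frac{\partial^2\psi_1}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial\psi_1}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2\psi_1}{\partial\varphi^2} - b^2\rho^2\psi_1 + 2ib\,\frac{\partial\psi_1}{\partial\varphi} + 2b\,(2l - 1)\,\psi_1 = 0 \tag{2.15} $$

$$ \frac{\partial^2\psi_2}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial\psi_2}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2\psi_2}{\partial\varphi^2} - b^2\rho^2\psi_2 + 2ib\,\frac{\partial\psi_2}{\partial\varphi} + 2b\,(2l + 1)\,\psi_2 = 0 \tag{2.15*} $$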
We could arrive at these equations more directly if we used the
operator $P$ transformed to cylindrical coordinates according to
the formulas in Section 6, Part III.

Equations (2.15) and (2.15*) are easily solved by separating
the variables. In this case it is sufficient to consider the equation
for one of the functions $\psi_1$ or $\psi_2$, since owing to (2.7) these
functions are connected in the following way:

$$ \frac{\partial\psi_2}{\partial x} - i\,\frac{\partial\psi_2}{\partial y} + b\,(x - iy)\,\psi_2 = \frac{i}{\hbar}\,(P' - p'_z)\,\psi_1 \tag{2.16} $$

$$ \frac{\partial\psi_1}{\partial x} + i\,\frac{\partial\psi_1}{\partial y} - b\,(x + iy)\,\psi_1 = \frac{i}{\hbar}\,(P' + p'_z)\,\psi_2 \tag{2.16*} $$

which are a generalization of (1.18) when there is a magnetic
field.

Equations (2.14) and (2.14*) are obtained from (2.16) and
(2.16*) by excluding one of the functions, $\psi_1$ or $\psi_2$.

We put

$$ \psi_2 = \lambda e^{-im\varphi} f \tag{2.17} $$

where $\lambda$ is a constant factor, $m$ is an integer, and $f$ depends only
on $\rho$, and we introduce a new independent variable

$$ \xi = b\rho^2 \tag{2.18} $$

The equation for $f$ that follows from (2.15*) will be

$$ -\frac{d}{d\xi}\left(\xi\,\frac{df}{d\xi}\right) + \left(\frac{\xi}{4} + \frac{m^2}{4\xi}\right) f = \left(l + \frac{m + 1}{2}\right) f \tag{2.19} $$


This equation differs only in notation from the equation for
functions connected with the generalized Laguerre polynomials,
which was considered in Chapter V, Part II, devoted to Schrödinger's
theory [see Eq. (3.3*), Chapter V, Part II]. We saw that
the eigenvalues of the operator on the left-hand side of (2.19) are

$$ l + \frac{m + 1}{2} = \frac{|m| + 1}{2} + p, \qquad p = 0, 1, 2, \dots $$

so that $l$ will be a nonnegative integer:

$$ l = 0, 1, 2, \dots \tag{2.20} $$

With the value of $l$ given, the number $m$ will assume the values

$$ m = -l,\ -l + 1,\ \dots,\ -1,\ 0,\ 1,\ 2,\ \dots \tag{2.21} $$

At $m \geq 0$ we can take as eigenfunctions

$$ f_{lm}(\xi) = e^{-\xi/2}\,\xi^{m/2}\,Q_l^m(\xi) \tag{2.22} $$

and at $m < 0$

$$ f_{lm}(\xi) = (-1)^m\,e^{-\xi/2}\,\xi^{-m/2}\,Q_{l+m}^{-m}(\xi) \tag{2.22*} $$
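where $Q_p^s$ are the generalized Laguerre polynomials, which we
studied in detail in Section 4, Chapter V, Part II. With such a
definition of $f_{lm}(\xi)$ we can put

$$ \psi_2 = \lambda e^{-im\varphi} f_{lm}(\xi) \tag{2.23} $$

Function $\psi_1$ is expressed in terms of $\psi_2$ by (2.16), which in polar
coordinates can be written as

$$ \psi_1 = -\frac{i\hbar}{P' - p'_z}\,e^{-i\varphi}\left(\frac{\partial}{\partial\rho} - \frac{i}{\rho}\frac{\partial}{\partial\varphi} + b\rho\right)\psi_2 \tag{2.24} $$

Substituting (2.23) for $\psi_2$, we find that

$$ \psi_1 = -\lambda\,\frac{i\hbar\sqrt{b}}{P' - p'_z}\,e^{-i(m+1)\varphi}\,\frac{1}{\sqrt{\xi}}\left[2\xi\,\frac{df}{d\xi} + (\xi - m)\,f\right] \tag{2.25} $$

On the basis of the properties of the polynomials $Q_p^s(x)$

$$ \frac{dQ_p^s(x)}{dx} = -p\,Q_{p-1}^{s+1}(x) \qquad \text{[(4.9), Chapter V, Part II]} $$

$$ x\,\frac{dQ_p^s(x)}{dx} + s\,Q_p^s(x) = (p + s)\,Q_p^{s-1}(x) \qquad \text{[(4.12), Chapter V, Part II]} $$

it is easy to show that for both positive and negative values of $m$
there is the relationship

$$ \frac{1}{\sqrt{\xi}}\left[2\xi\,\frac{df_{lm}}{d\xi} + (\xi - m)\,f_{lm}\right] = -2l\,f_{l-1,\,m+1} \tag{2.26} $$

Substituting (2.26) into (2.25), we get

$$ \psi_1 = \lambda\,\frac{i\hbar\sqrt{b}}{P' - p'_z}\,2l\,e^{-i(m+1)\varphi}\,f_{l-1,\,m+1}(\xi) \tag{2.27} $$

For brevity we rewrite (2.27) and (2.23) as

$$ \psi_1 = \lambda\psi_1^0, \quad \psi_2 = \lambda\psi_2^0 \tag{2.28} $$

where $\psi_1^0$ and $\psi_2^0$ are the functions defined above.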
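The relationship (2.26) can be tested numerically. The Python sketch below is an illustration only: it assumes the normalization $Q_p^s(x) = p!\,L_p^s(x)$, where $L_p^s$ is the generalized Laguerre polynomial in the common modern convention, which is one choice consistent with the derivative rule quoted from (4.9). The helper names `Q`, `f` and `check_226` are hypothetical, and the derivative is taken by central differences.

```python
import math


def Q(p, s, x):
    """Assumed normalization Q_p^s(x) = p! * L_p^s(x); zero for p < 0."""
    if p < 0:
        return 0.0
    return math.factorial(p) * sum(
        (-1) ** i * math.comb(p + s, p - i) * x ** i / math.factorial(i)
        for i in range(p + 1))


def f(l, m, xi):
    """Functions (2.22) for m >= 0 and (2.22*) for m < 0."""
    if m >= 0:
        return math.exp(-xi / 2) * xi ** (m / 2) * Q(l, m, xi)
    if l + m < 0:          # outside the range (2.21)
        return 0.0
    return (-1) ** m * math.exp(-xi / 2) * xi ** (-m / 2) * Q(l + m, -m, xi)


def check_226(l, m, xi, h=1e-5):
    """Return both sides of (2.26), each multiplied by sqrt(xi)."""
    df = (f(l, m, xi + h) - f(l, m, xi - h)) / (2 * h)   # central difference
    lhs = 2 * xi * df + (xi - m) * f(l, m, xi)
    rhs = -2 * l * math.sqrt(xi) * f(l - 1, m + 1, xi)
    return lhs, rhs


for l in range(3):
    for m in range(-l, 3):
        print(l, m, check_226(l, m, 0.8))
```

The two returned numbers should agree to within the accuracy of the finite difference for every admissible pair $(l, m)$, including negative $m$.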

Equations (2.10) for the two component function are the first
two equations in the set of equations for four component functions

$$ (s_1 P_x + s_2 P_y + s_3 P_z)\,\psi = P'\psi \tag{2.29} $$

In (2.10) $\sigma_1, \sigma_2, \sigma_3$ are the $2\times2$ Pauli matrices, and in (2.29)
$s_1, s_2, s_3$ are the $4\times4$ matrices (1.21). The structure of these
$4\times4$ matrices is such that the first two equations in (2.29) coincide,
as we have just noted, with Eqs. (2.10), and the last two
are obtained from them by replacing $\psi_1$ with $-\psi_4$ and $\psi_2$ with $\psi_3$.
Hence, if functions (2.28) satisfy the first two equations in (2.29),
then along with

$$ \psi_3 = \mu\psi_2^0, \quad \psi_4 = -\mu\psi_1^0 \tag{2.30} $$

they will satisfy all the equations in (2.29). This holds regardless
of the form of operators $P_x, P_y, P_z$; in particular, it holds for the free
electron considered in Section 1, where (1.19) and (1.23) correspond
to (2.28) and (2.30).

In view of this correspondence there is no need to repeat the
manipulations of Section 1. We only need to remember (1.13),
which connects the eigenvalues $W$ and $P'$:

$$ W = \pm\left(m^2c^4 + c^2P'^2\right)^{1/2} \tag{2.31} $$

and the expression that follows from (2.13) for $P'$:

$$ P' = \pm\left(p_z'^2 + 4\hbar^2 b\,l\right)^{1/2} \tag{2.32} $$
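Formulas (2.13), (2.31) and (2.32) together give the energy levels in closed form. A minimal Python sketch follows; the function name `dirac_landau_energy` and the default units $m = e = c = \hbar = 1$ (Gaussian-type units, as in the text) are assumptions made for illustration.

```python
import math


def dirac_landau_energy(l, pz, field, sign=+1, m=1.0, e=1.0, c=1.0, hbar=1.0):
    """Energy W for quantum number l = 0, 1, 2, ... and momentum pz along the field.

    b     follows (2.13): b = e|H| / (2 hbar c)
    P'^2  follows (2.32): P'^2 = pz^2 + 4 hbar^2 b l
    W     follows (2.31): W = +-(m^2 c^4 + c^2 P'^2)^(1/2)
    """
    if l < 0:
        raise ValueError("l must be a nonnegative integer, see (2.20)")
    b = e * abs(field) / (2 * hbar * c)
    P_squared = pz ** 2 + 4 * hbar ** 2 * b * l
    return sign * math.sqrt(m ** 2 * c ** 4 + c ** 2 * P_squared)


# Lowest few positive-energy levels for pz = 0 in a unit field.
for l in range(4):
    print(l, dirac_landau_energy(l, 0.0, 1.0))
```

For $l = 0$ and $p'_z = 0$ the energy reduces to the rest energy $mc^2$, since $P' = 0$ in that case.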

In conclusion we note that the dependence of the functions
(2.23) and (2.27) on angle $\varphi$ shows that they are the eigenfunctions
of the operator

$$ \mathcal{M}_z = xp_y - yp_x + \frac{\hbar}{2}\,\sigma_z = p_\varphi + \frac{\hbar}{2}\,\sigma_z \tag{2.33} $$

for the eigenvalue

$$ \mathcal{M}'_z = -\left(m + \tfrac{1}{2}\right)\hbar \tag{2.34} $$

Operator $\mathcal{M}_z$ commutes with all three operators $H$, $P$ and $p_z$ that
enter into Eqs. (2.6), (2.7), and (2.8). This fact expresses the
axial symmetry of the problem under consideration.

3. Constants of the motion in the problem with spherical symmetry

Let us examine the problem of describing electronic states in
a field with spherical symmetry according to Dirac's theory. We
explored the same problem according to Schrödinger's theory in
Chapters IV and V, Part II. In addition, in Part III, devoted to
Pauli's theory, we studied the properties of the angular momentum
of an electron possessing spin. Now we will see the distinctions
that Dirac's theory introduces; this theory explains the
existence of doublets and gives a complete picture of the splitting
of energy levels in a magnetic field.

It is convenient to follow the method used in classical mechanics,
that is, to first consider the problem in rectangular Cartesian
coordinates and only then go on to spherical coordinates.

In rectangular coordinates the Hamiltonian for our problem has
the form

$$ H = c\rho_a(\sigma_x p_x + \sigma_y p_y + \sigma_z p_z) + mc^2\rho_c + U(r) \tag{3.1} $$

or

$$ H = c\rho_a P + mc^2\rho_c + U(r) \tag{3.2} $$

where

$$ P = \sigma_x p_x + \sigma_y p_y + \sigma_z p_z \tag{3.3} $$

is the operator introduced when we considered the Pauli equation
[see (5.11), Part III]. The difference here is only that in Dirac's
theory the operators for the components of spin, $\sigma_x, \sigma_y, \sigma_z$, are
represented by $4\times4$ matrices and in Pauli's theory by $2\times2$
matrices.

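Pauli's theory introduces the operators

$$ \begin{aligned} \mathcal{M}_x &= m_x + \tfrac{1}{2}\hbar\sigma_x \\ \mathcal{M}_y &= m_y + \tfrac{1}{2}\hbar\sigma_y \\ \mathcal{M}_z &= m_z + \tfrac{1}{2}\hbar\sigma_z \end{aligned} \tag{3.4} $$

for the components of total (orbital and spin) angular momentum.
These operators satisfy the commutation relations

$$ \begin{aligned} \mathcal{M}_y\mathcal{M}_z - \mathcal{M}_z\mathcal{M}_y &= i\hbar\mathcal{M}_x \\ \mathcal{M}_z\mathcal{M}_x - \mathcal{M}_x\mathcal{M}_z &= i\hbar\mathcal{M}_y \\ \mathcal{M}_x\mathcal{M}_y - \mathcal{M}_y\mathcal{M}_x &= i\hbar\mathcal{M}_z \end{aligned} \tag{3.5} $$

The operator composed of the three components, namely

$$ \mathcal{M} = \sigma_x\mathcal{M}_x + \sigma_y\mathcal{M}_y + \sigma_z\mathcal{M}_z - \tfrac{1}{2}\hbar \tag{3.6} $$

can be represented as

$$ \mathcal{M} = \sigma_x m_x + \sigma_y m_y + \sigma_z m_z + \hbar \tag{3.7} $$

We also see that the last operator commutes with each of the
operators $\mathcal{M}_x, \mathcal{M}_y, \mathcal{M}_z$. In addition, as we saw in Section 5,
Part III, the operator $\mathcal{M}$ anticommutes with $P$, which is determined
by (3.3):

$$ \mathcal{M}P + P\mathcal{M} = 0 \tag{3.8} $$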

The Hamiltonian (3.2) in Dirac's theory includes operator $P$
multiplied by matrix $\rho_a$, and also two terms that commute with $\rho_c$.
Since the matrices $\rho_a$ and $\rho_c$ anticommute, it follows directly that
the operator

$$ \mathcal{M}_D = \rho_c\mathcal{M} = \mathcal{M}\rho_c \tag{3.9} $$

will commute with $\rho_a P$ and hence with all the terms in $H$, so that
we get

$$ H\mathcal{M}_D - \mathcal{M}_D H = 0 \tag{3.10} $$

and consequently

$$ \frac{d}{dt}\,(\mathcal{M}_D) = 0 \tag{3.11} $$

Hence the quantity corresponding to operator $\mathcal{M}_D$ will be constant
for a field with spherical symmetry. We calculated the time derivative
of this quantity for an arbitrary field in Section 10, Chapter I
[formula (10.11)].

We have seen that in the problem with spherical symmetry the
three operators $H$, $\mathcal{M}_D = \rho_c\mathcal{M}$, and $\mathcal{M}_z$ commute with each other.

For this reason we can consider the set of equations

$$ \begin{aligned} H\psi &= W\psi \\ \mathcal{M}_D\psi &= k\hbar\,\psi \\ \mathcal{M}_z\psi &= \left(m + \tfrac{1}{2}\right)\hbar\,\psi \end{aligned} \tag{3.12} $$

The last two equations are closely connected with the equations
for spherical harmonics with spin, which we examined in Part III.

Here we have designated the integer proportional to an eigenvalue
of $\mathcal{M}_D$ by the same letter $k$ as the integer proportional to an
eigenvalue of $\mathcal{M}$ in Pauli's theory [formula (1.22), Part III]. This
should not arouse confusion because in both cases $k$ assumes the
same values and the physical meaning of $\mathcal{M}$ and $\mathcal{M}_D$ in the
corresponding theories is similar.

4. Generalized spherical harmonics 

To find the simultaneous eigenfunctions of the operators
$\mathcal{M}_D = \rho_c\mathcal{M}$ and $\mathcal{M}_z$ we must transform them to spherical coordinates.
At the same time we will perform a canonical transformation of
the four component wave function similar to the one that we used
in Section 2, Part III, for the two component wave function of
Pauli's theory.

We will denote the $4\times4$ matrices $\sigma_x, \sigma_y, \sigma_z$ that correspond to
our choice of the Dirac matrices as $s_1, s_2, s_3$.¹ According to
formulas (4.26), Chapter I, we have

$$ s_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix}, \quad s_2 = \begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix}, \quad s_3 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{4.1} $$

¹ There is no reason to fear confusion of these matrices with the operators
$s_x, s_y, s_z$, introduced in Part IV, for the components of the spin angular
momentum of a system of electrons.


To express the operators $\mathcal{M}_z$ and $\mathcal{M}$ in spherical coordinates we
can use the formulas derived in Section 2, Part III, on the basis
of Pauli's theory, the only difference being that we must change
the matrices $\sigma_1, \sigma_2, \sigma_3$ to $s_1, s_2, s_3$. Let us write the most important
of these formulas using the new notation.

In the spherical coordinates $r, \theta, \varphi$, which are connected with
the rectangular coordinates $x, y, z$ by the relationships

$$ x = r\sin\theta\cos\varphi, \quad y = r\sin\theta\sin\varphi, \quad z = r\cos\theta \tag{4.2} $$

the operators $\mathcal{M}_z$ and $\mathcal{M}$ have the form

$$ \mathcal{M}_z = p_\varphi + \frac{\hbar}{2}\,s_3 \tag{4.3} $$

$$ \mathcal{M} = (-s_1\sin\varphi + s_2\cos\varphi)\,p_\theta + (-s_1\cot\theta\cos\varphi - s_2\cot\theta\sin\varphi + s_3)\,p_\varphi + \hbar \tag{4.4} $$

where, as usual, $p_r, p_\theta, p_\varphi$ stand for the operators

$$ p_r = -i\hbar\,\frac{\partial}{\partial r}, \quad p_\theta = -i\hbar\,\frac{\partial}{\partial\theta}, \quad p_\varphi = -i\hbar\,\frac{\partial}{\partial\varphi} \tag{4.5} $$

(the operator $p_r$ does not enter into the expressions for $\mathcal{M}_z$
and $\mathcal{M}$).

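We perform the canonical transformation of the operators and
functions according to formulas similar to (6.7) and (6.8),
Part III, namely

$$ \mathcal{M}' = S\mathcal{M}S^+, \quad \psi' = S\psi \tag{4.6} $$

where

$$ S = \cos\frac{\varphi}{2} + is_3\sin\frac{\varphi}{2}, \quad S^+ = \cos\frac{\varphi}{2} - is_3\sin\frac{\varphi}{2} \tag{4.7} $$

After transformation we have

$$ \mathcal{M}'_z = p_\varphi \tag{4.8} $$

$$ \mathcal{M}' = -s_1\cot\theta\,p_\varphi + s_2\left(p_\theta - \frac{i\hbar}{2}\cot\theta\right) + s_3\,p_\varphi + \frac{\hbar}{2} \tag{4.9} $$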
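The matrix part of this transformation is easy to check numerically. The Python sketch below is illustrative and not from the original text: it verifies that $S$ from (4.7) rotates $s_1$ and $s_2$ into each other while leaving $s_3$ unchanged, which is what turns the coefficient of $p_\theta$ in (4.4) into $s_2$ alone. The helper names are hypothetical; the extra terms in (4.9) that come from $p_\varphi$ acting on $S^+$ are not covered here.

```python
import math

S1 = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, -1, 0]]
S2 = [[0, -1j, 0, 0], [1j, 0, 0, 0], [0, 0, 0, -1j], [0, 0, 1j, 0]]
S3 = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]
IDENT = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def lincomb(*terms):
    """Sum of (coefficient, matrix) pairs."""
    return [[sum(c * mat[i][j] for c, mat in terms) for j in range(4)]
            for i in range(4)]


def dagger(a):
    """Hermitian conjugate of a 4x4 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]


def rotate(phi, a):
    """Return S a S^+ with S = cos(phi/2) + i s_3 sin(phi/2), as in (4.7)."""
    S = lincomb((math.cos(phi / 2), IDENT), (1j * math.sin(phi / 2), S3))
    return matmul(matmul(S, a), dagger(S))


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


phi = 0.7
r1, r2, r3 = rotate(phi, S1), rotate(phi, S2), rotate(phi, S3)
# S s_1 S^+ = s_1 cos(phi) - s_2 sin(phi),  S s_2 S^+ = s_2 cos(phi) + s_1 sin(phi)
print(max_diff(r1, lincomb((math.cos(phi), S1), (-math.sin(phi), S2))))
print(max_diff(r2, lincomb((math.cos(phi), S2), (math.sin(phi), S1))))
print(max_diff(r3, S3))
# The p_theta coefficient of (4.4), -s_1 sin(phi) + s_2 cos(phi), becomes s_2.
print(max_diff(lincomb((-math.sin(phi), r1), (math.cos(phi), r2)), S2))
```

All four printed differences should be at the level of rounding error for any value of the angle.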


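Then we perform a transformation similar to (6.21), Part III,
namely

$$ \mathcal{M}'' = T\mathcal{M}'T^+, \quad \psi'' = T\psi' \tag{4.10} $$

where

$$ T = \cos\frac{\theta}{2} + is_2\sin\frac{\theta}{2}, \quad T^+ = \cos\frac{\theta}{2} - is_2\sin\frac{\theta}{2} \tag{4.11} $$

As in Pauli's theory we get

$$ \mathcal{M}'' = -\frac{s_1}{\sin\theta}\,p_\varphi + s_2\left(p_\theta - \frac{i\hbar}{2}\cot\theta\right) \tag{4.12} $$

Finally, putting

$$ \mathcal{M}^* = (\sin\theta)^{1/2}\,\mathcal{M}''\,(\sin\theta)^{-1/2}, \quad \psi^* = (\sin\theta)^{1/2}\,\psi'' \tag{4.13} $$

we have

$$ \mathcal{M}^* = -\frac{s_1}{\sin\theta}\,p_\varphi + s_2\,p_\theta, \quad \mathcal{M}^*_z = p_\varphi \tag{4.14} $$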


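According to (4.7) and (4.11) the matrices of the canonical
transformation, $S$ and $T$, contain the operators $s_1, s_2, s_3$ but do
not contain $\rho_a, \rho_b, \rho_c$. For this reason the appearance of the latter
operators does not change in the transformation. Specifically,
according to (4.27), Chapter I, we have

$$ \rho_c = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \tag{4.15} $$

After multiplying by $\rho_c$ the eigenvalue equation for $\mathcal{M}_D = \rho_c\mathcal{M}$
we can write it as

$$ \mathcal{M}^*\psi^* = k\hbar\,\rho_c\,\psi^* \tag{4.16} $$

The system of equations for the four components of $\psi^*$, which
corresponds to (4.16), will be

$$ \begin{aligned} -\frac{1}{\sin\theta}\,p_\varphi\psi^*_2 - i\,p_\theta\psi^*_2 &= -k\hbar\,\psi^*_4 \\ -\frac{1}{\sin\theta}\,p_\varphi\psi^*_1 + i\,p_\theta\psi^*_1 &= k\hbar\,\psi^*_3 \\ \frac{1}{\sin\theta}\,p_\varphi\psi^*_4 - i\,p_\theta\psi^*_4 &= k\hbar\,\psi^*_2 \\ \frac{1}{\sin\theta}\,p_\varphi\psi^*_3 + i\,p_\theta\psi^*_3 &= -k\hbar\,\psi^*_1 \end{aligned} \tag{4.17} $$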

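If we express the operators $p_\varphi$ and $p_\theta$ in terms of derivatives and
change the sign on both sides of some of these equations, we can
write them as two like systems of equations for two functions
each, namely

$$ \begin{aligned} \frac{i}{\sin\theta}\frac{\partial\psi^*_3}{\partial\varphi} - \frac{\partial\psi^*_3}{\partial\theta} &= k\,\psi^*_1 \\ \frac{i}{\sin\theta}\frac{\partial\psi^*_1}{\partial\varphi} + \frac{\partial\psi^*_1}{\partial\theta} &= k\,\psi^*_3 \end{aligned} \tag{4.18} $$

$$ \begin{aligned} \frac{i}{\sin\theta}\frac{\partial\psi^*_2}{\partial\varphi} - \frac{\partial\psi^*_2}{\partial\theta} &= -k\,\psi^*_4 \\ \frac{i}{\sin\theta}\frac{\partial\psi^*_4}{\partial\varphi} + \frac{\partial\psi^*_4}{\partial\theta} &= -k\,\psi^*_2 \end{aligned} \tag{4.18*} $$

Equations (4.18) and (4.18*) differ only in the sign before $k$.

We met these equations in Pauli's theory when we examined
spherical harmonics with spin [Section 3, Part III]. We wrote
them in the form

$$ \begin{aligned} \frac{i}{\sin\theta}\frac{\partial Z}{\partial\varphi} - \frac{\partial Z}{\partial\theta} &= kY \\ \frac{i}{\sin\theta}\frac{\partial Y}{\partial\varphi} + \frac{\partial Y}{\partial\theta} &= kZ \end{aligned} \tag{4.19} $$

The solutions of our equations (4.18) and (4.18*) will then be

$$ \begin{aligned} \psi^*_1 &= f(r)\,Y(\theta, \varphi) \\ \psi^*_2 &= g(r)\,Z(\theta, \varphi) \\ \psi^*_3 &= f(r)\,Z(\theta, \varphi) \\ \psi^*_4 &= -g(r)\,Y(\theta, \varphi) \end{aligned} \tag{4.20} $$

where the functions $f(r)$ and $g(r)$ no longer depend on $\theta$ and $\varphi$.
Their dependence on $r$ is determined by the eigenvalue equation
for the Hamiltonian.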


5. The radial equation 

Now let us turn to the Hamiltonian. After going over to spherical
coordinates the Hamiltonian can be written as

$$ H^* = c\rho_a P^* + mc^2\rho_c + U(r) \tag{5.1} $$

Operator $P^*$ for four component functions is obtained from the
corresponding operator for two component functions by substituting
the $4\times4$ matrices $s_1, s_2, s_3$ for the $2\times2$ Pauli matrices
$\sigma_1, \sigma_2, \sigma_3$. On the basis of (6.37), Part III, we have

$$ P^* = \frac{s_1}{r}\,p_\theta + \frac{s_2}{r\sin\theta}\,p_\varphi + s_3\,p_r \tag{5.2} $$

This operator is connected with the operator

$$ \mathcal{M}^* = -\frac{s_1}{\sin\theta}\,p_\varphi + s_2\,p_\theta \tag{5.3} $$

which was examined in Section 4, by the same relationship as in
Pauli's theory [see (6.38), Part III], namely

$$ P^* = s_3\left(p_r + \frac{i}{r}\,\mathcal{M}^*\right) \tag{5.4} $$

We assume that the four component function $\psi^*$ is an eigenfunction
of the operator

$$ \mathcal{M}^*_D = \rho_c\,\mathcal{M}^* \tag{5.5} $$

which (unlike $\mathcal{M}^*$) commutes with the Hamiltonian. For this reason
we can use formula (4.16) and put

$$ \mathcal{M}^*\psi^* = k\hbar\,\rho_c\,\psi^* \tag{5.6} $$

In view of (5.4) we have

$$ P^*\psi^* = s_3\left(p_r + \frac{ik\hbar}{r}\,\rho_c\right)\psi^* \tag{5.7} $$

and consequently

$$ \rho_a P^*\psi^* = s_3\left(\rho_a\,p_r + \rho_b\,\frac{k\hbar}{r}\right)\psi^* \tag{5.8} $$

Hence the eigenvalue equation for the Hamiltonian can be written as

$$ H^*\psi^* = \left(c\rho_a s_3\,p_r + c\rho_b s_3\,\frac{k\hbar}{r} + mc^2\rho_c + U(r)\right)\psi^* = W\psi^* \tag{5.9} $$

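This equation includes the matrices

$$ \tau_a = \rho_a s_3, \quad \tau_b = \rho_b s_3, \quad \tau_c = \rho_c \tag{5.10} $$

which satisfy the same relationships as $\rho_a, \rho_b, \rho_c$:

$$ \tau_a\tau_b = i\tau_c, \quad \tau_b\tau_c = i\tau_a, \quad \tau_c\tau_a = i\tau_b \tag{5.11} $$

To simplify further calculations we write the matrices $\tau_a, \tau_b, \tau_c$
in explicit form. We have

$$ \tau_a = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \quad \tau_b = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & -i & 0 \\ 0 & i & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix}, \quad \tau_c = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix} \tag{5.12} $$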
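The explicit matrices can be cross-checked against their definition. The short Python sketch below (an illustration, with hypothetical helper names) forms $\tau_a$, $\tau_b$, $\tau_c$ from the products (5.10), using $\rho_a$, $\rho_b$, $\rho_c$ and $s_3$ as given in Section 1, and tests the relationships (5.11).

```python
RHO_A = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
RHO_B = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
RHO_C = [[0, 0, 0, -1], [0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, 0]]
S3 = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def scale(c, a):
    return [[c * a[i][j] for j in range(4)] for i in range(4)]


# Definitions (5.10)
TAU_A = matmul(RHO_A, S3)
TAU_B = matmul(RHO_B, S3)
TAU_C = RHO_C

# Relationships (5.11); entries are exact, so == is safe here.
print(matmul(TAU_A, TAU_B) == scale(1j, TAU_C))
print(matmul(TAU_B, TAU_C) == scale(1j, TAU_A))
print(matmul(TAU_C, TAU_A) == scale(1j, TAU_B))
```

Since $\tau_a$ comes out diagonal, the radial momentum $p_r$ in the Hamiltonian couples each component only to itself, which is what makes the expanded equations that follow so simple.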


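We rewrite Eq. (5.9) in the form

$$ H^*\psi^* = \left(c\tau_a\,p_r + c\tau_b\,\frac{k\hbar}{r} + mc^2\tau_c + U(r)\right)\psi^* = W\psi^* \tag{5.13} $$

Using the expressions (5.12) for the matrices $\tau_a, \tau_b, \tau_c$, we can
write Eq. (5.13) in the expanded form. After shifting the term
with potential energy to the right side we get

$$ \begin{aligned} c\,p_r\psi^*_1 - ic\,\frac{k\hbar}{r}\,\psi^*_4 - mc^2\psi^*_4 &= (W - U)\,\psi^*_1 \\ -c\,p_r\psi^*_2 - ic\,\frac{k\hbar}{r}\,\psi^*_3 + mc^2\psi^*_3 &= (W - U)\,\psi^*_2 \\ c\,p_r\psi^*_3 + ic\,\frac{k\hbar}{r}\,\psi^*_2 + mc^2\psi^*_2 &= (W - U)\,\psi^*_3 \\ -c\,p_r\psi^*_4 + ic\,\frac{k\hbar}{r}\,\psi^*_1 - mc^2\psi^*_1 &= (W - U)\,\psi^*_4 \end{aligned} \tag{5.14} $$

Now we substitute the $\psi^*_i$ from (4.20) and replace operator $p_r$
with the corresponding derivative. For the radial functions $f(r)$
and $g(r)$ we get the following set of equations:

$$ \begin{aligned} -ic\hbar\,\frac{df}{dr} + ic\,\frac{k\hbar}{r}\,g + mc^2 g &= (W - U)\,f \\ ic\hbar\,\frac{dg}{dr} - ic\,\frac{k\hbar}{r}\,f + mc^2 f &= (W - U)\,g \end{aligned} \tag{5.15} $$

each repeated twice.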

6. Comparison with the Schrödinger equation

We can rid ourselves of the complex coefficients in (5.15) by
putting

$$ \frac{f + g}{\sqrt{2}} = f_1, \quad \frac{f - g}{i\sqrt{2}} = f_2 \tag{6.1} $$

whence

$$ f_1 + if_2 = \sqrt{2}\,f, \quad f_1 - if_2 = \sqrt{2}\,g \tag{6.2} $$

Adding and subtracting the two equations in (5.15), we get
for the new functions $f_1$ and $f_2$ a system of two equations of the
first order with real coefficients:

$$ \begin{aligned} \frac{df_1}{dr} - \frac{k}{r}\,f_1 &= \frac{-mc^2 - W + U}{\hbar c}\,f_2 \\ \frac{df_2}{dr} + \frac{k}{r}\,f_2 &= \frac{-mc^2 + W - U}{\hbar c}\,f_1 \end{aligned} \tag{6.3} $$

When the energy $W$ is close to $+mc^2$, the coefficient of $f_2$ in the
first equation is much greater than the coefficient of $f_1$ in the
second. For this reason $f_2$ is extremely small compared with $f_1$:

$$ |f_2| \ll |f_1| \tag{6.4} $$


Consequently, the functions $f$ and $g$ in (5.15) are nearly equal
(and nearly real). To compare the system of equations (6.3) with
the Schrödinger equation we put

$$ W = mc^2 + E \tag{6.5} $$

and consider the values of $E/(mc^2)$ and $(E - U)/(mc^2)$ extremely
small compared with unity. If we ignore them, we get

$$ \begin{aligned} \frac{df_1^0}{dr} - \frac{k}{r}\,f_1^0 &= -\frac{2mc}{\hbar}\,f_2^0 \\ \frac{df_2^0}{dr} + \frac{k}{r}\,f_2^0 &= \frac{E - U}{\hbar c}\,f_1^0 \end{aligned} \tag{6.6} $$

where $f_1^0$ and $f_2^0$ are approximate values of $f_1$ and $f_2$. When we
exclude $f_2^0$ from these equations, we get the following equation
for $f_1^0$:

$$ -\frac{\hbar^2}{2m}\frac{d^2 f_1^0}{dr^2} + \frac{\hbar^2}{2m}\frac{(k - 1)k}{r^2}\,f_1^0 + U f_1^0 = E f_1^0 \tag{6.7} $$

If here we put

$$ f_1^0 = r\,R(r) \tag{6.8} $$

then the equation for $R(r)$

$$ \frac{d^2 R}{dr^2} + \frac{2}{r}\frac{dR}{dr} - \frac{k(k - 1)}{r^2}\,R + \frac{2m}{\hbar^2}\,[E - U(r)]\,R = 0 \tag{6.9} $$

coincides with the Schrödinger equation for the radial function
[see (3.16), Chapter IV, Part II], provided that the Schrödinger
quantum number $l$ is linked with our quantum number $k$ by the
relationship

$$ k(k - 1) = l(l + 1) \tag{6.10} $$

which coincides with (3.17), Part III. Thus the number $l$ introduced
in Section 3, Part III (that is, the degree of the ordinary
spherical harmonics in terms of which spherical harmonics with
spin are expressed) is nothing but the azimuthal quantum number
of Schrödinger's theory.
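The link (6.10) between $k$ and $l$ can be written out as a small lookup. The Python sketch below is illustrative; the function names are hypothetical, and the exclusion of $k = 0$ is the usual convention for this quantum number rather than something stated in this section.

```python
def l_from_k(k):
    """Nonnegative solution l of k(k - 1) = l(l + 1) for a nonzero integer k."""
    if k == 0:
        raise ValueError("k = 0 is excluded (assumed convention)")
    return k - 1 if k > 0 else -k


def k_values(l):
    """Values of k that share the same l; for l = 0 only k = 1 remains."""
    if l < 0:
        raise ValueError("l must be nonnegative")
    return [l + 1] if l == 0 else [l + 1, -l]


for k in (-3, -2, -1, 1, 2, 3):
    l = l_from_k(k)
    print(k, l, k * (k - 1) == l * (l + 1))
```

Each $l > 0$ thus corresponds to two values of $k$, which is the origin of the doublets discussed in this chapter.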

We can exclude $f_2$ from Eqs. (6.3) without any omissions. We
then get

$$ \frac{d^2 f_1}{dr^2} - \frac{k(k - 1)}{r^2}\,f_1 + \frac{2m}{\hbar^2}\,(E - U)\,f_1 = -\frac{1}{2mc^2 + E - U}\,\frac{dU}{dr}\left(\frac{df_1}{dr} - \frac{k}{r}\,f_1\right) - \frac{(E - U)^2}{\hbar^2c^2}\,f_1 \tag{6.11} $$

On the right-hand side there are small terms that represent a
correction for the theory of relativity and for spin. For the two
values

$$ k = l + 1 \quad\text{and}\quad k = -l \tag{6.12} $$

for which the left-hand side of (6.11) is the same, the values of
this correction are different. The difference of the corrections for
the energy levels (the difference of the diagonal elements of the
matrix for the correction terms) gives the approximate value of
the distance between the spectral terms, namely

$$ \Delta E = E(k) - E(-k + 1) = \frac{\hbar^2}{4m^2c^2}\,(2k - 1)\int_0^\infty \frac{1}{r}\frac{dU}{dr}\,[f_1^0(r)]^2\,dr \tag{6.13} $$

where $f_1^0(r)$ is the solution to Eq. (6.7) with the following
normalization:

$$ \int_0^\infty [f_1^0(r)]^2\,dr = 1 \tag{6.14} $$

Here let us note one transformation of (6.11). If we put

$$ f_1 = (2mc^2 + E - U)^{1/2}\,\varphi \tag{6.15} $$

the equation for $\varphi$ will be

$$ \frac{d^2\varphi}{dr^2} - \frac{k(k - 1)}{r^2}\,\varphi + \frac{2m}{\hbar^2}\,(E - U)\,\varphi = \frac{1}{2mc^2 + E - U}\left(\frac{1}{2}\frac{d^2U}{dr^2} + \frac{k}{r}\frac{dU}{dr}\right)\varphi - \frac{(E - U)^2}{\hbar^2c^2}\,\varphi + \frac{3}{4\,(2mc^2 + E - U)^2}\left(\frac{dU}{dr}\right)^2\varphi \tag{6.16} $$

This equation no longer contains the first derivative of the
unknown function. If we assume that $|r\,(dU/dr)| \ll mc^2$, the last
term on the right-hand side of (6.16) can be dropped. In the first term
on the right-hand side we can ignore $E - U$, which is much smaller
than $2mc^2$.

7. General investigation of the radial equations 

We want to study Eqs. (6.3). These two equations have two singular
points:

$$ r = 0 \quad\text{and}\quad r = \infty $$

We begin by examining the region close to $r = 0$. Let us
assume that for small values of $r$ the potential energy $U(r)$ can
be expanded in a power series:

$$ U(r) = -\frac{A_1}{r} + A' + A''r + \dots \tag{7.1} $$

where the expansion coefficient $-A_1$ is equal, as we noted in
Section 7, Chapter IV, Part II, to the product of the charge of the
nucleus, $Ze$, by the electron charge, $-e$, so that $A_1 = Ze^2$.
In the coefficients of the right-hand sides of Eqs. (6.3) we drop
all terms but those that become infinite at $r = 0$ and we get

$$ \begin{aligned} \frac{df_1}{dr} - \frac{k}{r}\,f_1 &= -\frac{A_1}{\hbar c}\,\frac{1}{r}\,f_2 \\ \frac{df_2}{dr} + \frac{k}{r}\,f_2 &= \frac{A_1}{\hbar c}\,\frac{1}{r}\,f_1 \end{aligned} \tag{7.2} $$

We assume that near $r = 0$

$$ \begin{aligned} f_1 &= a_1 r^{\varepsilon} + c_1 r^{\varepsilon + 1} + \dots \\ f_2 &= a_2 r^{\varepsilon} + c_2 r^{\varepsilon + 1} + \dots \end{aligned} \tag{7.3} $$

and we insert these expansions into Eq. (7.2). Equating the coefficients
of $r^{\varepsilon - 1}$, we get a system of homogeneous linear equations

$$ \begin{aligned} a_1(\varepsilon - k) + \frac{A_1}{\hbar c}\,a_2 &= 0 \\ -\frac{A_1}{\hbar c}\,a_1 + (\varepsilon + k)\,a_2 &= 0 \end{aligned} \tag{7.4} $$

to determine $a_1$ and $a_2$. These equations have a solution if the
system determinant vanishes:

$$ \varepsilon^2 - k^2 + \frac{A_1^2}{\hbar^2c^2} = 0 \tag{7.5} $$

From this for exponent $\varepsilon$ we get two values

$$ \varepsilon = \pm\left[k^2 - \frac{A_1^2}{\hbar^2c^2}\right]^{1/2} = \pm\varepsilon_0 \tag{7.6} $$

where $\varepsilon_0$ is a positive quantity:

$$ \varepsilon_0 = \left[k^2 - Z^2\left(\frac{e^2}{\hbar c}\right)^2\right]^{1/2} \tag{7.6*} $$

We know that $e^2/\hbar c$ is a dimensionless quantity approximately
equal to 1/137. For this reason for all possible values of $Z$ and $k$
the quantity in brackets will be positive. The constant

$$ \frac{e^2}{\hbar c} \approx \frac{1}{137} \tag{7.7} $$

is called the fine-structure constant.

Hence near $r = 0$ the general solution to Eqs. (6.3) has the
form

$$ \begin{aligned} f_1 &= a\,r^{\varepsilon_0}\,(1 + \dots) + b\,r^{-\varepsilon_0}\left[\frac{A_1}{\hbar c\,(\varepsilon_0 + k)} + \dots\right] \\ f_2 &= a\,r^{\varepsilon_0}\left[\frac{A_1}{\hbar c\,(\varepsilon_0 + k)} + \dots\right] + b\,r^{-\varepsilon_0}\,(1 + \dots) \end{aligned} \tag{7.8} $$

For the functions $f_1$ and $f_2$ to vanish at $r = 0$ the constant $b$
must be zero.

If $k^2 = 1$, then $\varepsilon_0 = [1 - (Z/137)^2]^{1/2} < 1$. For this reason,
though $f_1$ and $f_2$ and, consequently, $\psi^*$ as well will vanish at $r = 0$,
the initial function $\psi = \psi^*/(r\sin^{1/2}\theta)$ will become (for $|k| = 1$)
infinite as $r \to 0$ and it will grow as

$$ r^{\varepsilon_0 - 1} = r^{[1 - (Z/137)^2]^{1/2} - 1} \tag{7.9} $$

This constitutes a certain defect in the theory. It may arise from
the fact that the Coulomb law of attraction cannot be extrapolated
to distances so small that

$$ \frac{Ze^2}{r} > mc^2 \tag{7.10} $$

that is,

$$ r < Z \times 3 \times 10^{-13}\ \text{cm} \tag{7.11} $$
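To get a feeling for the numbers in (7.6*) and (7.9), the exponent can be evaluated directly. The Python sketch below is illustrative: the function names are hypothetical, and the value 1/137 for $e^2/\hbar c$ is the approximate one quoted in the text.

```python
import math

FINE_STRUCTURE = 1.0 / 137.0   # approximate value of e^2/(hbar c) used in the text


def epsilon0(k, Z, alpha=FINE_STRUCTURE):
    """Exponent (7.6*): eps0 = [k^2 - (Z e^2/(hbar c))^2]^(1/2)."""
    arg = k * k - (Z * alpha) ** 2
    if arg <= 0:
        raise ValueError("k^2 - (Z alpha)^2 must be positive")
    return math.sqrt(arg)


def growth_exponent(k, Z, alpha=FINE_STRUCTURE):
    """Exponent of r in (7.9): psi behaves as r^(eps0 - 1) near r = 0."""
    return epsilon0(k, Z, alpha) - 1.0


for Z in (1, 20, 80):
    print(Z, epsilon0(1, Z), growth_exponent(1, Z), growth_exponent(2, Z))
```

For $|k| = 1$ the exponent in (7.9) is negative for every $Z$, though only slightly so for light nuclei; for $|k| \geq 2$ it is positive and the function stays finite at the origin.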

Now let us investigate the equations for large values of $r$. We
assume as in Section 7, Chapter IV, Part II, that the potential
energy far away from the nucleus is

$$ U(r) = -\frac{A}{r} + \frac{B}{r^2} + \dots \tag{7.12} $$

We ask for the solution to (6.3) in the form

$$ \begin{aligned} f_1 &= e^{\alpha r}\,(a_1 r^{\beta} + b_1 r^{\beta - 1} + \dots) \\ f_2 &= e^{\alpha r}\,(a_2 r^{\beta} + b_2 r^{\beta - 1} + \dots) \end{aligned} \tag{7.13} $$

We substitute these expansions into the equations and equalize
the coefficients of the terms of order $e^{\alpha r} r^{\beta}$ and $e^{\alpha r} r^{\beta - 1}$. We get

$$ \begin{aligned} a_1\alpha + a_2\,\frac{mc^2 + W}{\hbar c} &= 0 \\ a_1\,\frac{mc^2 - W}{\hbar c} + a_2\alpha &= 0 \end{aligned} \tag{7.14} $$

$$ \begin{aligned} b_1\alpha + b_2\,\frac{mc^2 + W}{\hbar c} &= -a_1(\beta - k) - a_2\,\frac{A}{\hbar c} \\ b_1\,\frac{mc^2 - W}{\hbar c} + b_2\alpha &= a_1\,\frac{A}{\hbar c} - a_2(\beta + k) \end{aligned} \tag{7.15} $$

If we nullify the system determinant of (7.14), we get the following
values for $\alpha$:

$$ \alpha = \pm\frac{1}{\hbar c}\left(m^2c^4 - W^2\right)^{1/2} \tag{7.16} $$

The left-hand sides of (7.15) have the same coefficients as
Eqs. (7.14). Since the determinant constructed from these coefficients
is zero, we can exclude $b_1$ and $b_2$ from (7.15) if we multiply
the first equation by $-\alpha$ and the second by $(mc^2 + W)/(\hbar c)$
and add the results. This yields

$$ a_1\alpha\,(\beta - k) + a_2\alpha\,\frac{A}{\hbar c} + a_1\,\frac{A}{\hbar c}\,\frac{mc^2 + W}{\hbar c} - a_2\,\frac{mc^2 + W}{\hbar c}\,(\beta + k) = 0 $$

whence, expressing $a_2\alpha$ and $a_2(mc^2 + W)/(\hbar c)$ in terms of $a_1$ by
means of Eqs. (7.14), we get after some manipulating

$$ 2a_1\alpha\beta + 2a_1\,\frac{AW}{\hbar^2c^2} = 0 $$

For $\beta$ this equation gives the value

$$ \beta = -\frac{AW}{\hbar^2c^2\alpha} \tag{7.17} $$

We will not determine the constants $b_1$ and $b_2$.

In conformity with the two signs of $\alpha$ the general integral will
have the following form:

$$ f_1 = C_1\,\frac{mc^2 + W}{\hbar c}\,e^{\alpha r} r^{\beta}\,(1 + \dots) + C_2\,\frac{mc^2 + W}{\hbar c}\,e^{-\alpha r} r^{-\beta}\,(1 + \dots) \tag{7.18} $$

$$ f_2 = -C_1\,\alpha\,e^{\alpha r} r^{\beta}\,(1 + \dots) + C_2\,\alpha\,e^{-\alpha r} r^{-\beta}\,(1 + \dots) \tag{7.18*} $$

where the dots stand for terms of order $1/r$ and higher. If we assume that

$$ |W| > mc^2 \tag{7.19} $$

then $\alpha$ and $\beta$ will be pure imaginary and the functions $f_1$ and $f_2$
will remain finite as $r \to \infty$ with any choice of the constants $C_1$
and $C_2$. But we can select these constants so that $f_1$ and $f_2$ will
vanish at $r = 0$. We can say, therefore, that the region (7.19) belongs
to the continuous spectrum. There can be no discrete spectrum
in this region because if $\alpha$ is pure imaginary the functions
$f_1$ and $f_2$ are not square integrable.

But if

$$ -mc^2 < W < +mc^2 \tag{7.20} $$

then $\alpha$ will be real (we will consider it positive). For this reason
$f_1$ and $f_2$ will either rapidly increase (if $C_1 \neq 0$) or rapidly
fall off (if $C_1 = 0$) at infinity, so that in region (7.20) there can
be no continuous spectrum. There is either a discrete spectrum or
no eigenvalues at all.

Finally, if

$$ W = \pm mc^2 \tag{7.21} $$

then $\alpha$ vanishes and $\beta$ becomes infinite, so that (7.18) and (7.18*)
break down. We must look for the asymptotic solution of our
equations in a form similar to (7.11), Chapter IV, Part II. If we
put

$$ \alpha_0 = \left(\frac{8mA}{\hbar^2}\right)^{1/2} \tag{7.22} $$

we get for the case when $W = -mc^2$ by reasoning in a similar
way

$$ \begin{aligned} f_1 &= -C_1\,\frac{\hbar\alpha_0}{4mc}\,e^{\alpha_0\sqrt{r}}\,r^{-1/4} + \dots + C_2\,\frac{\hbar\alpha_0}{4mc}\,e^{-\alpha_0\sqrt{r}}\,r^{-1/4} + \dots \\ f_2 &= C_1\,e^{\alpha_0\sqrt{r}}\,r^{1/4} + \dots + C_2\,e^{-\alpha_0\sqrt{r}}\,r^{1/4} + \dots \end{aligned} \tag{7.23} $$

and for the case when $W = +mc^2$

$$ \begin{aligned} f_1 &= C_1\,e^{i\alpha_0\sqrt{r}}\,r^{1/4} + \dots + C_2\,e^{-i\alpha_0\sqrt{r}}\,r^{1/4} + \dots \\ f_2 &= -C_1\,\frac{i\hbar\alpha_0}{4mc}\,e^{i\alpha_0\sqrt{r}}\,r^{-1/4} + \dots + C_2\,\frac{i\hbar\alpha_0}{4mc}\,e^{-i\alpha_0\sqrt{r}}\,r^{-1/4} + \dots \end{aligned} \tag{7.24} $$

When at great distances from the nucleus there is only attraction,
$A > 0$ and $\alpha_0$ is real. In this case the value $W = +mc^2$
belongs to the continuous energy spectrum and $W = -mc^2$ does
not. But in the case of repulsive forces $A < 0$ and $\alpha_0$ is pure
imaginary; then $W = -mc^2$ belongs to the continuous spectrum and
$W = +mc^2$ does not.

Hence we have established that in the case of attraction the
continuous spectrum will be

$$ W < -mc^2, \quad W \geq +mc^2, \quad A > 0 \tag{7.25} $$

and in the case of repulsion

$$ W \leq -mc^2, \quad W > +mc^2, \quad A < 0 \tag{7.25*} $$

whereas a discrete spectrum is possible only if

$$ |W| < mc^2 \tag{7.26} $$
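The classification just obtained can be summarized in a few lines of code. The Python sketch below is an illustration only: the function names, the returned labels and the default units $m = c = \hbar = 1$ are assumptions. It evaluates $\alpha$ and $\beta$ from (7.16) and (7.17) with the upper sign of $\alpha$ and sorts a given energy into the regions (7.25), (7.25*) and (7.26).

```python
import cmath


def asymptotic_exponents(W, A, m=1.0, c=1.0, hbar=1.0):
    """Return (alpha, beta) from (7.16) and (7.17), taking the '+' root for alpha."""
    mc2 = m * c * c
    alpha = cmath.sqrt(mc2 * mc2 - W * W) / (hbar * c)
    if alpha == 0:
        raise ValueError("W = +-mc^2: (7.18) breaks down, use (7.23) or (7.24)")
    beta = -A * W / (hbar ** 2 * c ** 2 * alpha)
    return alpha, beta


def spectrum_region(W, A, m=1.0, c=1.0):
    """Classify the energy W according to (7.25), (7.25*) and (7.26)."""
    mc2 = m * c * c
    if abs(W) < mc2:
        return "discrete spectrum possible"
    if abs(W) > mc2:
        return "continuous spectrum"
    if A == 0:
        return "boundary case not covered"
    # Boundary points: W = +mc^2 is continuous for attraction (A > 0),
    # W = -mc^2 is continuous for repulsion (A < 0).
    continuous = (W > 0 and A > 0) or (W < 0 and A < 0)
    return "continuous spectrum" if continuous else "not in the spectrum"


print(asymptotic_exponents(0.5, 1.0))   # alpha real: growth or decay at infinity
print(asymptotic_exponents(2.0, 1.0))   # alpha and beta pure imaginary: oscillation
print(spectrum_region(1.0, 1.0), "|", spectrum_region(-1.0, 1.0))
```

Real $\alpha$ corresponds to the region (7.20), where only the solution with $C_1 = 0$ is admissible; pure imaginary $\alpha$ and $\beta$ correspond to the region (7.19).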


From Eqs. (6.3) for radial functions we can derive some general
corollaries regarding the distribution of the energy levels in
the discrete spectrum. If we multiply the first equation in (6.3)
by $f_2$ and the second by $f_1$ and add the results, we get

$$ \frac{d}{dr}(f_1 f_2) = -\frac{1}{\hbar c}\left[ (mc^2 + W - U)\, f_2^2 + (mc^2 - W + U)\, f_1^2 \right] \tag{7.27} $$

By integrating this expression from 0 to $\infty$ and bearing in mind
the behaviour of the functions of the discrete spectrum we get
zero on the left, and the right-hand side yields

$$ \int_0^\infty U (f_1^2 - f_2^2)\, dr = -\int_0^\infty \left[ (mc^2 + W)\, f_2^2 + (mc^2 - W)\, f_1^2 \right] dr \tag{7.28} $$

But we know that in the discrete spectrum $W$ lies between $-mc^2$
and $+mc^2$. Whence the right-hand side is negative, and we have
the inequality

$$ \int_0^\infty U (f_1^2 - f_2^2)\, dr < 0 \tag{7.29} $$

It follows from this that at a negative $U$ (attraction) $f_1^2$ is on
the average greater than $f_2^2$. As we saw in (6.4), this is the case
when $W$ is close to $+mc^2$, that is, when $W > 0$. Consequently, in
the case of attraction there are no negative energy levels belonging
to the discrete spectrum.

But in the case of repulsion ($U > 0$) there are no positive energy
levels but there may be negative levels. These negative levels
can have no direct physical meaning. (The same was true of the
states with negative kinetic energy mentioned in Section 12,
Chapter I.)

8. Quantum numbers

According to the results of our investigation the stationary
state of an electron in a central field can be characterized by the
energy parameter and the quantum numbers $k$ and $m$, the first of
which is connected with the total angular momentum and the second
with its projection on the z axis. For the discrete spectrum
the energy $W$ will depend on a third (principal or radial) quantum
number, which is introduced when solving the radial equation,
and also on the number $k$, which enters this equation as a
parameter. Hence here, as in Schrödinger’s theory, the electronic
state in the discrete spectrum is described by three quantum numbers,
and the energy depends on two of them.

As we found in Section 6, the leading terms of the second-order
equation analogous to the Schrödinger equation contain
the quadratic expression $k(k-1)$, whereas the number $k$ alone
enters only into the correction term. The second-order equation
contains $k(k-1)$ in the same way that the Schrödinger equation
contains $l(l+1)$. Hence we can put

$$ k(k-1) = l(l+1) \tag{8.1} $$

whence

$$ l = \left| k - \tfrac{1}{2} \right| - \tfrac{1}{2} \tag{8.2} $$


For this reason the two energy levels that correspond to the
same principal quantum number $n$ and to the same $l$ (or $|k - 1/2|$)
but to two different values of $k$:

$$ k = l + 1, \qquad k = -l \tag{8.3} $$

are quite close to each other and form a doublet. The exception is
when $l = 0$. Since $k$ cannot take on a zero value, only one level
remains, and that is with $k = +1$.

The doublet separation was calculated in Section 6 [see (6.13)]. 

Thus Dirac’s theory gives us what is required by experiment:
the doubling (compared with Schrödinger’s theory) of energy levels;
and the level with $l = 0$ is nondegenerate as experiments
require. This doubling expresses one of the two additional (intrinsic)
degrees of freedom of the electron, which were mentioned in
Section 12, Chapter I.

It is customary to distinguish between the two levels of a doublet
by the values of a new quantum number, which we denote
by $j$. The quantum number $j$, like $l$, is uniquely determined by $k$,
namely

$$ j = |k| - \tfrac{1}{2} \tag{8.4} $$

Hence $j$ can take on positive half-integer values (1/2, 3/2, 5/2, ...).

Since the number of values of the magnetic quantum number $m$
for a given $k$ is $2|k|$, the value of $j$ gives the multiplicity of the
level, which is

$$ 2|k| = 2j + 1 \tag{8.5} $$

It follows from a comparison of (8.4) with (8.2) that $j$ differs
from $l$ by $\pm 1/2$, namely

$$ j = l + \tfrac{1}{2} \quad \text{at } k > 0 $$
$$ j = l - \tfrac{1}{2} \quad \text{at } k < 0 \tag{8.6} $$

Going by this formula, if we know $j$ and $l$, we can find the sign
of $k$ and hence $k$ itself.

Spectroscopy usually denotes spectral terms with different values
of $l$, that is, $l = 0, 1, 2, \ldots$, by the letters S, P, D, ..., and
the value of $j$ is given as the lower index to these letters. We can
relate different quantum numbers to different spectral terms in
the form of the following table:

    k = +1,    l = 0,    j = 1/2,    S_{1/2}
    k = -1,    l = 1,    j = 1/2,    P_{1/2}
    k = +2,    l = 1,    j = 3/2,    P_{3/2}
    k = -2,    l = 2,    j = 3/2,    D_{3/2}
    k = +3,    l = 2,    j = 5/2,    D_{5/2}

It is the selection rule that determines between which spectral 
terms transitions are possible. 
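
The assignment of $l$, $j$ and the spectral term to a given $k$ can be sketched in a few lines of Python. This is only an illustration of relations (8.2), (8.4) and (8.5); the function name `quantum_numbers` and the letter string are assumptions made for this sketch and are not part of the course.

```python
from fractions import Fraction

def quantum_numbers(k):
    """Return (l, j, term) for a nonzero integer k, following (8.2) and (8.4).

    Hypothetical helper written for illustration only.
    """
    if k == 0:
        raise ValueError("k cannot take on a zero value")
    # (8.2): l = |k - 1/2| - 1/2, i.e. l = k - 1 for k > 0 and l = -k for k < 0
    l = k - 1 if k > 0 else -k
    # (8.4): j = |k| - 1/2
    j = Fraction(2 * abs(k) - 1, 2)
    letters = "SPDFGH"  # spectroscopic letters for l = 0, 1, 2, ...
    term = f"{letters[l]}{j.numerator}/{j.denominator}"
    return l, j, term

for k in (1, -1, 2, -2, 3):
    l, j, term = quantum_numbers(k)
    # (8.5): the multiplicity of the level is 2|k| = 2j + 1
    print(f"k = {k:+d}  l = {l}  j = {j}  term = {term}  multiplicity = {2 * abs(k)}")
```

Exact fractions are used for $j$ so that the half-integer values are not subject to rounding.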

9. Heisenberg’s matrices and the selection rule 

We will denote the wave function corresponding to the quantum
numbers $n, k, m$ (where $n$ is the principal quantum number) by $\psi$
or another letter without a prime, and the wave function with the
quantum numbers $n', k', m'$ by $\psi'$ or the alternative letter with
a prime (here we discard the asterisk, which in Section 4 was
used to distinguish wave functions in spherical coordinates).

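If the wave function is normalized so that

$$ \iiint \bar\psi \psi \, dr\, d\theta\, d\varphi = 1 \tag{9.1} $$

any element of Heisenberg’s matrix for any one of the coordinates,
say, $x$, will be

$$ (nkm|x|n'k'm') = \iiint \bar\psi\, x\, \psi' \, dr\, d\theta\, d\varphi = \iiint x\, (\bar\psi_1\psi_1' + \bar\psi_2\psi_2' + \bar\psi_3\psi_3' + \bar\psi_4\psi_4')\, dr\, d\theta\, d\varphi \tag{9.2} $$

If we insert in (9.1) the expressions for $\psi_1, \psi_2, \psi_3, \psi_4$, (4.20), the
normalization condition will be

$$ \iiint (\bar f f + \bar g g)(\bar Y Y + \bar Z Z)\, dr\, d\theta\, d\varphi = 1 \tag{9.3} $$

This holds if

$$ \int_0^\infty (\bar f f + \bar g g)\, dr = 1 \tag{9.4} $$

and

$$ \int_{\theta=0}^{\pi} \int_{\varphi=0}^{2\pi} (\bar Y Y + \bar Z Z)\, d\theta\, d\varphi = 1 \tag{9.5} $$

Now if we insert (4.20) into (9.2), we get

$$ (nkm|x|n'k'm') = \iiint x\, (\bar f f' + \bar g g')(\bar Y Y' + \bar Z Z')\, dr\, d\theta\, d\varphi \tag{9.6} $$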

Similar expressions can be written for the other coordinates, $y$
and $z$.

As in Schrödinger’s theory, the triple integrals of type (9.6)
separate into the products of simple integrals. And if we put

$$ r_D(nk; n'k') = \int_0^\infty r\, (\bar f f' + \bar g g')\, dr \tag{9.7} $$

then the elements of Heisenberg’s matrices for the coordinates
$x, y, z$ will be equal to the product of (9.7) multiplied respectively
by

$$ (km|\sin\theta\cos\varphi|k'm') = \iint \sin\theta\cos\varphi\, (\bar Y Y' + \bar Z Z')\, d\theta\, d\varphi \tag{9.8} $$

$$ (km|\sin\theta\sin\varphi|k'm') = \iint \sin\theta\sin\varphi\, (\bar Y Y' + \bar Z Z')\, d\theta\, d\varphi \tag{9.9} $$

$$ (km|\cos\theta|k'm') = \iint \cos\theta\, (\bar Y Y' + \bar Z Z')\, d\theta\, d\varphi \tag{9.10} $$

To evaluate these integrals we express $Y$ and $Z$ in terms of $y_1$
and $y_2$ by means of (3.4) and (3.13), Part III. We get

$$ \bar Y Y' + \bar Z Z' = \frac{1}{4\pi}\, e^{i(m'-m)\varphi}\, (y_1 y_1' + y_2 y_2') \sin\theta \tag{9.11} $$

We first evaluate the integral (9.10). It will obviously differ
from zero only if $m' = m$; in this case it will be

$$ (km|\cos\theta|k'm) = \frac{1}{2}\int_0^\pi \cos\theta\, (y_1 y_1' + y_2 y_2') \sin\theta\, d\theta \tag{9.12} $$

or, if we use the notations of Section 4, Part III,

$$ (km|\cos\theta|k'm) = \int_0^\pi \cos\theta\; y(k, m, \theta)\, y(k', m, \theta) \sin\theta\, d\theta \tag{9.12*} $$



If we now express the product $y(k, m, \theta)\cos\theta$ in accordance with
(4.9), Part III,


y(k, m, 0) cos 0 

- *. «■ 6) + "* -4 - n -— «/(* + !. rn, 6) 

+ " t + ”^:n~ (9.13) 


then on the basis of the orthogonality of functions $y$ we conclude
that the integral (9.12) is nonzero in only three cases:

$$ k' = -k, \qquad k' = k + 1, \qquad k' = k - 1 \tag{9.14} $$

In these cases it is equal to the respective coefficients in (9.13), 
namely 

(km | cos 01 — k m) — — (9.15) 

(km\ cos 01 k + 1 m) — + ( 9 - 15 *) 

(km 1 cos 91 k — 1 m) = I( * + )] ' - (9.15") 

We compute the first two integrals, (9.8) and (9.9), in a similar
way. For this it is convenient to do as we did in Section 9, Chapter
IV, Part II, that is, construct their linear combination

$$ (km|\sin\theta\, e^{i\varphi}|k'm') = \frac{1}{2\pi}\iint e^{i(m'-m+1)\varphi} \sin^2\theta\; y(k, m, \theta)\, y(k', m', \theta)\, d\theta\, d\varphi \tag{9.16} $$

which, obviously, will differ from zero only if $m' = m - 1$. With
this condition it is

$$ (km|\sin\theta\, e^{i\varphi}|k'\, m-1) = \int_0^\pi \sin^2\theta\; y(k, m, \theta)\, y(k', m-1, \theta)\, d\theta \tag{9.17} $$

If we now express the product $y(k', m-1, \theta)\sin\theta$ by means of
(4.10), Part III,

y(k\ /n — 1 , 0 ) sin 0 = 2 y (~ k '> "». 0 ) 


[(k'-m-DW-m)] 

2 h' _i y (k *, tn, o) 


+ 


[( k' + m+l)(k' + m)]' h 
2A'+ 1 


y(k'+\,m,Q) (9.18) 


we find that the integral in (9.17) does not vanish in only three 
cases (9.14), when it is 

(— k'm \ sin0e ,< *’| k'm — 1) = 2 \ * 

(k' —l m \ sin Qe i( f\k'm— 1 )= \(k' -m-\)(V-m)]'i' 

(k'+\m\ sin0| k'm - 1) = t( * / 1 m+ ^ 


(9.19)

or, if we express $k'$ in terms of $k$,

(km\ sin 0 e ,(1> | — km — 1 ) = 2 — 

(km | sin 9 e* \ k + 1 m - 1) = - t(fe ~ w) ~ ” + 1)1 '* ■ 

(km | sin 9 1 k - 1 m - 1) = t( * + m) £ + ” ~ 01 ' - (9.19*) 

From this we get the matrix elements (9.8) and (9.9) according 
to formulas similar to (9.23) and (9.24), Chapter IV, Part II. We 
write these elements in the form of a table: 

$(km|\sin\theta\cos\varphi|k'm')$        $(km|\sin\theta\sin\varphi|k'm')$



These results reflect the selection rule that determines between
which terms transitions are possible or impossible.

The selection rule with respect to quantum number $m$ will be
the same as in Schrödinger’s theory; namely, for coordinate $z$
(light polarized along the z axis)

$$ m' = m \tag{9.20} $$

and for coordinates $x$ and $y$ (light polarized in the xy plane)

$$ m' = m \pm 1 \tag{9.21} $$

Levels that differ from each other by the value of the quantum
number $m$ can be distinguished only in a magnetic field directed
along the z axis. It is not surprising then that the z axis plays
such a special role in the selection rule with respect to $m$: the
direction of this axis is physically selected by the direction of the
magnetic field.

The selection rule with respect to $k$ is

$$ k' = -k, \qquad k' = k + 1, \qquad k' = k - 1 \tag{9.22} $$

This rule states that quantum number $l = |k - 1/2| - 1/2$ always
changes by unity, as in Schrödinger’s theory. However, not all
transitions of the type $l' = l \pm 1$ are possible. There is a second
condition for quantum number $j$: it must either remain unchanged
or change only by unity. For example, a transition is possible
between the $P_{3/2}$ and $D_{5/2}$ terms but not between the $P_{1/2}$ and $D_{5/2}$
terms.

10. Alternative derivation of the selection rule 

In view of the importance of the selection rule we will note 
another derivation, less elementary but not requiring a knowledge 
of spherical harmonics. The idea of this derivation belongs to 
Dirac. 

Consider the operator

$$ \mathcal{M}_z = m_z + \tfrac{1}{2}\hbar\sigma_z $$

with eigenvalues

$$ (m + \tfrac{1}{2})\hbar $$

The matrix representing this operator will be diagonal in quantum
number $m$. If we write only this quantum number and assume
the rest, we get

$$ (m|\mathcal{M}_z|m') = (m + \tfrac{1}{2})\hbar\, \delta_{mm'} $$

Now we consider the matrix for coordinate $z$ with elements

$$ (m|z|m') $$

From the commutation relation

$$ \mathcal{M}_z z - z \mathcal{M}_z = 0 \tag{10.1} $$

we get the following relationship between the matrix elements:

$$ (m + \tfrac{1}{2})\hbar\, (m|z|m') - (m|z|m')\, (m' + \tfrac{1}{2})\hbar = 0 \tag{10.2} $$

or

$$ (m - m')\, (m|z|m') = 0 \tag{10.2*} $$

Consequently, the only matrix elements of $z$ that differ from zero
are those for which $m' = m$. As we know, this reflects the selection
rule for $z$ with respect to quantum number $m$.

Now we note that the selection rules for $x, y, z$ are the same as
for $\dot x = c\rho_a\sigma_x$, $\dot y = c\rho_a\sigma_y$, $\dot z = c\rho_a\sigma_z$. This is why instead of $x, y, z$
we can deal with the matrices

$$ \alpha_1 = \rho_a\sigma_x, \qquad \alpha_2 = \rho_a\sigma_y, \qquad \alpha_3 = \rho_a\sigma_z $$

which in some cases is simpler. For instance, from

$$ \mathcal{M}_z \alpha_3 - \alpha_3 \mathcal{M}_z = 0 $$

or

$$ \mathcal{M}_z \dot z - \dot z \mathcal{M}_z = 0 \tag{10.3} $$

it follows that

$$ (m - m')\, (m|\dot z|m') = 0 \tag{10.4} $$

which is what we obtained before.

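We will derive the selection rule for $x$ and $y$ or for $\dot x$ and $\dot y$,
which is the same. We have

$$ \mathcal{M}_z \dot x - \dot x \mathcal{M}_z = c\rho_a (\mathcal{M}_z \sigma_x - \sigma_x \mathcal{M}_z) = i\hbar c\rho_a\sigma_y $$

or

$$ \mathcal{M}_z \dot x - \dot x \mathcal{M}_z = i\hbar \dot y \tag{10.5} $$

and in the same way

$$ \mathcal{M}_z \dot y - \dot y \mathcal{M}_z = c\rho_a (\mathcal{M}_z \sigma_y - \sigma_y \mathcal{M}_z) = -i\hbar c\rho_a\sigma_x $$

or

$$ \mathcal{M}_z \dot y - \dot y \mathcal{M}_z = -i\hbar \dot x \tag{10.6} $$

If we multiply (10.6) by $i$ and add the product to (10.5), we get

$$ \mathcal{M}_z (\dot x + i\dot y) - (\dot x + i\dot y) \mathcal{M}_z = \hbar (\dot x + i\dot y) \tag{10.7} $$

Passing to matrix elements we get

$$ (m - m' - 1)\, (m|\dot x + i\dot y|m') = 0 \tag{10.8} $$

which is the result that can be obtained from (9.16).

In a similar manner we obtain

$$ (m - m' + 1)\, (m|\dot x - i\dot y|m') = 0 \tag{10.9} $$

Whence the condition that the matrix elements for $\dot x$ and for $\dot y$ do
not vanish is

$$ m' = m \pm 1 \tag{10.10} $$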

The way we proceeded can be somewhat altered. It follows
from (10.5) and (10.6) that

$$ \mathcal{M}_z^2 \dot x - 2\mathcal{M}_z \dot x \mathcal{M}_z + \dot x \mathcal{M}_z^2 - \hbar^2 \dot x = 0 \tag{10.11} $$

Passing to matrix elements we get

$$ (m + \tfrac{1}{2})^2 (m|\dot x|m') - 2(m + \tfrac{1}{2})(m|\dot x|m')(m' + \tfrac{1}{2}) + (m|\dot x|m')(m' + \tfrac{1}{2})^2 - (m|\dot x|m') = 0 $$

or

$$ [(m - m')^2 - 1]\, (m|\dot x|m') = 0 \tag{10.12} $$

from which we get the previous result (10.10).

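We now seek the selection rule with respect to quantum number
$k$. The quantity $k\hbar$ is the eigenvalue of the operator

$$ \mathcal{M}_D = \rho_c \mathcal{M} \tag{10.13} $$

where

$$ \mathcal{M} = \sigma_x m_x + \sigma_y m_y + \sigma_z m_z + \hbar \tag{10.14} $$

We considered $\mathcal{M}_D$ in Section 3 (it is the generalization of the
corresponding operator in Pauli’s theory). By formula (1.18),
Part III, operator $\mathcal{M}$ satisfies the relationship

$$ \mathcal{M}^2 = \hbar\mathcal{M} + (m_x^2 + m_y^2 + m_z^2) \tag{10.15} $$

Let us consider the operator

$$ \mathcal{L} = \mathcal{M}_D (\mathcal{M}_D^2 \dot z - \dot z \mathcal{M}_D^2) - (\mathcal{M}_D^2 \dot z - \dot z \mathcal{M}_D^2) \mathcal{M}_D \tag{10.16} $$

In view of (10.13) and the equality

$$ \mathcal{M}_D^2 = \mathcal{M}^2 \tag{10.17} $$

we can write $\mathcal{L}$ in the following form:

$$ \mathcal{L} = (\rho_c \mathcal{M})(\mathcal{M}^2 \dot z - \dot z \mathcal{M}^2) - (\mathcal{M}^2 \dot z - \dot z \mathcal{M}^2)(\rho_c \mathcal{M}) \tag{10.18} $$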


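Since matrix $\rho_c$ commutes with $\mathcal{M}$ and anticommutes with
$\dot z = c\rho_a\sigma_z$, we can write $\mathcal{L}$ as

$$ \mathcal{L} = \rho_c [\mathcal{M} (\mathcal{M}^2 \dot z - \dot z \mathcal{M}^2) + (\mathcal{M}^2 \dot z - \dot z \mathcal{M}^2) \mathcal{M}] \tag{10.19} $$

But owing to (10.15)

$$ \mathcal{M}^2 \dot z - \dot z \mathcal{M}^2 = \hbar (\mathcal{M} \dot z - \dot z \mathcal{M}) \tag{10.20} $$

which yields

$$ \mathcal{L} = \hbar\rho_c [\mathcal{M} (\mathcal{M} \dot z - \dot z \mathcal{M}) + (\mathcal{M} \dot z - \dot z \mathcal{M}) \mathcal{M}] \tag{10.19*} $$

instead of (10.19). The terms of type $\mathcal{M} \dot z \mathcal{M}$ cancel out and we get

$$ \mathcal{L} = \hbar\rho_c (\mathcal{M}^2 \dot z - \dot z \mathcal{M}^2) \tag{10.21} $$

After applying (10.20) once more we arrive at

$$ \mathcal{L} = \hbar^2 \rho_c (\mathcal{M} \dot z - \dot z \mathcal{M}) \tag{10.22} $$

If we return to operator $\mathcal{M}_D$ of Dirac’s theory and take into
account that $\dot z$ and $\rho_c$ anticommute, we can write

$$ \mathcal{L} = \hbar^2 (\mathcal{M}_D \dot z + \dot z \mathcal{M}_D) \tag{10.23} $$

By equating the initial and final expressions (10.16) and (10.23)
for $\mathcal{L}$, we come to the equality

$$ \mathcal{M}_D^3 \dot z - \mathcal{M}_D \dot z \mathcal{M}_D^2 - \mathcal{M}_D^2 \dot z \mathcal{M}_D + \dot z \mathcal{M}_D^3 - \hbar^2 (\mathcal{M}_D \dot z + \dot z \mathcal{M}_D) = 0 \tag{10.24} $$

Now we pass from operators to matrices and use the representation
in which $\mathcal{M}_D$ is diagonal. The matrix element for each
term in (10.24) is obtained from the matrix element $(k|\dot z|k')$ by
multiplying the latter by the eigenvalues $\hbar k$ and $\hbar k'$ of
$\mathcal{M}_D$, in the proper degree (by $\hbar k$ if $\mathcal{M}_D$ is to the left
of $\dot z$ and by $\hbar k'$ if $\mathcal{M}_D$ is to the right of $\dot z$). Dividing out $\hbar^3$, we get

$$ (k^3 - k k'^2 - k^2 k' + k'^3 - k - k')\, (k|\dot z|k') = 0 \tag{10.25} $$

or

$$ (k + k')(k - k' - 1)(k - k' + 1)\, (k|\dot z|k') = 0 \tag{10.25*} $$

From this follows the selection rule with respect to $k$:

$$ k' = -k, \qquad k' = k + 1, \qquad k' = k - 1 \tag{10.26} $$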
which we have already derived in a different way. 
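
The step from (10.25) to (10.25*) is a purely algebraic factorization, so it can be checked by brute force over a range of integers. The sketch below does this and lists the values of $k'$ for which the bracket vanishes; the function names and the search bound `k_max` are assumptions made for the illustration.

```python
def cubic_form(k, kp):
    # Bracket multiplying the matrix element in (10.25)
    return k**3 - k * kp**2 - k**2 * kp + kp**3 - k - kp

def factored_form(k, kp):
    # Bracket multiplying the matrix element in (10.25*)
    return (k + kp) * (k - kp - 1) * (k - kp + 1)

def allowed_k_primes(k, k_max=10):
    """Nonzero integers k' in [-k_max, k_max] for which the bracket vanishes."""
    candidates = [kp for kp in range(-k_max, k_max + 1) if kp != 0]
    return sorted(kp for kp in candidates if cubic_form(k, kp) == 0)

# The two forms agree for every pair of integers in the tested range.
assert all(cubic_form(k, kp) == factored_form(k, kp)
           for k in range(-8, 9) for kp in range(-8, 9))

for k in (1, 2, -2):
    print(k, allowed_k_primes(k))
```

For $k = 1$ the root $k' = k - 1 = 0$ is dropped, since $k$ cannot take on a zero value.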


11. The hydrogen atom. Radial functions

For the hydrogen atom, for which the potential energy is

$$ U(r) = -\frac{e^2}{r} \tag{11.1} $$

radial equations (6.3) permit an exact solution. In the given case
these equations have the form

$$ \frac{df_1}{dr} - \frac{k}{r} f_1 = \frac{1}{\hbar c}\left( -mc^2 - W - \frac{e^2}{r} \right) f_2 $$
$$ \frac{df_2}{dr} + \frac{k}{r} f_2 = \frac{1}{\hbar c}\left( -mc^2 + W + \frac{e^2}{r} \right) f_1 \tag{11.2} $$

We will confine ourselves to examining the discrete spectrum,
that is, when $W^2 < m^2c^4$. We put

$$ \alpha = \frac{(m^2c^4 - W^2)^{1/2}}{\hbar c} \tag{11.3} $$

and consider $\alpha$ positive. Having in mind the asymptotic formulas
(7.18), we introduce a new independent variable

$$ x = 2\alpha r \tag{11.4} $$




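We also put

$$ W = mc^2 \cos\varepsilon, \qquad \alpha = \frac{mc}{\hbar}\sin\varepsilon, \qquad 0 < \varepsilon < \pi \tag{11.5} $$

Finally, we recall the value of the fine-structure constant

$$ \gamma = \frac{e^2}{\hbar c} \approx \frac{1}{137} \tag{11.6} $$

After we change variables Eqs. (11.2) appear as

$$ \frac{df_1}{dx} - \frac{k}{x} f_1 = -\left( \frac{1}{2}\cot\frac{\varepsilon}{2} + \frac{\gamma}{x} \right) f_2 $$
$$ \frac{df_2}{dx} + \frac{k}{x} f_2 = \left( -\frac{1}{2}\tan\frac{\varepsilon}{2} + \frac{\gamma}{x} \right) f_1 \tag{11.7} $$

Angle $\varepsilon$ is a parameter here. We must determine it in such a way
that Eqs. (11.7) have solutions that are finite and continuous
throughout the interval $0 < x < \infty$ and that vanish at $x = 0$
and $x = \infty$.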

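Let us introduce two new functions, $F$ and $G$, such that

$$ f_1 = \frac{F - G}{2\sin(\varepsilon/2)}, \qquad f_2 = \frac{F + G}{2\cos(\varepsilon/2)} \tag{11.8} $$

The new functions are expressed in terms of $f_1$ and $f_2$ in the
following way:

$$ F(x) = f_1 \sin\frac{\varepsilon}{2} + f_2 \cos\frac{\varepsilon}{2} $$
$$ G(x) = -f_1 \sin\frac{\varepsilon}{2} + f_2 \cos\frac{\varepsilon}{2} \tag{11.9} $$

Multiplying the first equation in (11.7) by $\pm\sin(\varepsilon/2)$, the second
by $\cos(\varepsilon/2)$, and adding, we get

$$ \frac{dF}{dx} = -\frac{k}{x} G - \frac{1}{2} F + \frac{\gamma}{x\sin\varepsilon}(F\cos\varepsilon - G) $$
$$ \frac{dG}{dx} = -\frac{k}{x} F + \frac{1}{2} G + \frac{\gamma}{x\sin\varepsilon}(F - G\cos\varepsilon) \tag{11.10} $$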

We can exclude $G$ or $F$ from these equations. The result will be

$$ x^2 \frac{d^2F}{dx^2} + x \frac{dF}{dx} + \left[ -\frac{1}{4}x^2 + x\left( \gamma\cot\varepsilon + \frac{1}{2} \right) - k^2 + \gamma^2 \right] F = 0 \tag{11.11} $$

or

$$ x^2 \frac{d^2G}{dx^2} + x \frac{dG}{dx} + \left[ -\frac{1}{4}x^2 + x\left( \gamma\cot\varepsilon - \frac{1}{2} \right) - k^2 + \gamma^2 \right] G = 0 \tag{11.11*} $$

These equations are of the same type as

$$ \frac{d}{dx}\left( x \frac{dy}{dx} \right) + \left( -\frac{x}{4} + p + \frac{s+1}{2} - \frac{s^2}{4x} \right) y = 0 \tag{11.12} $$

which we considered in detail in the chapter devoted to a nonrelativistic
electron in a Coulomb field (Sections 3, 4, and 5, Chapter
V, Part II). To make Eq. (11.11) for $F$ coincide with (11.12)
for $y$ it is sufficient to put

$$ s = 2(k^2 - \gamma^2)^{1/2}, \qquad p + \frac{s}{2} = \gamma\cot\varepsilon \tag{11.13} $$

For (11.11*) parameter $s$ will have the same value and $p$ will
be less by one unit. Hence the eigenvalues are

$$ \gamma\cot\varepsilon = p + (k^2 - \gamma^2)^{1/2}, \qquad p = 0, 1, 2, \ldots \tag{11.14} $$

and the eigenfunctions are

$$ F(x) = C\, x^{s/2} e^{-x/2}\, Q_p^s(x) \tag{11.15} $$

$$ G(x) = C'\, x^{s/2} e^{-x/2}\, Q_{p-1}^s(x) \tag{11.15*} $$

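Since $F$ and $G$ are linked by the system of equations (11.10), the
ratio of the constants, $C/C'$, will be a definite number.

Solving the first of the equations in (11.10) with respect to $G$,
we get

$$ G = -\frac{1}{k + \gamma/\sin\varepsilon}\left[ x\frac{dF}{dx} + \left( \frac{x}{2} - p - \frac{s}{2} \right) F \right] = \frac{C}{k + \gamma/\sin\varepsilon}\, x^{s/2} e^{-x/2} \left( p\, Q_p^s - x\frac{dQ_p^s}{dx} \right) \tag{11.16} $$

But in view of the property of polynomial $Q_p^s$ derived earlier we
have

$$ p\, Q_p^s - x\frac{dQ_p^s}{dx} = p(p+s)\, Q_{p-1}^s $$

[(4.13*), Chapter V, Part II]

It also follows from (11.13) that

$$ p(p+s) = \frac{\gamma^2}{\sin^2\varepsilon} - k^2 \tag{11.17} $$

For this reason

$$ G(x) = C\left( \frac{\gamma}{\sin\varepsilon} - k \right) x^{s/2} e^{-x/2}\, Q_{p-1}^s(x) \tag{11.18} $$

Hence we have expressed the constant $C'$ in (11.15*) in terms
of $C$.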

To determine $C$ we turn to the normalization condition. As we
did in Schrödinger’s theory, we introduce the atomic unit of
length

$$ a = \frac{\hbar^2}{me^2} \tag{11.19} $$

and denote the distance from the nucleus in atomic units by

$$ r_1 = \frac{r}{a} \tag{11.20} $$

Constant $\alpha$, introduced by (11.3), will be

$$ \alpha = \frac{mc}{\hbar}\sin\varepsilon = \frac{\sin\varepsilon}{a\gamma} \tag{11.21} $$

so that variable $x$ is connected with $r_1$ by the relationship

$$ x = 2\alpha r = \frac{2\sin\varepsilon}{\gamma}\, r_1 \tag{11.22} $$

We take

$$ \int_0^\infty (f_1^2 + f_2^2)\, dr_1 = 1 \tag{11.23} $$

as the normalization condition, which also means that

$$ \int_0^\infty (f_1^2 + f_2^2)\, dx = \frac{2\sin\varepsilon}{\gamma} \tag{11.23*} $$

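If we express $f_1$ and $f_2$ in terms of $F$ and $G$, we get

$$ f_1^2 + f_2^2 = \frac{1}{\sin^2\varepsilon}\left( F^2 + G^2 - 2FG\cos\varepsilon \right) \tag{11.24} $$

Substituting (11.24) into (11.23*) and remembering that $F$
and $G$ are orthogonal to each other, we can write the normalization
condition as follows:

$$ \int_0^\infty (F^2 + G^2)\, dx = \frac{2\sin^3\varepsilon}{\gamma} \tag{11.23**} $$

From this we can find the normalization constant $C$:

$$ C^2 = \frac{1}{(p-1)!\, \Gamma(p+s)\, (-k + \gamma/\sin\varepsilon)}\, \frac{\sin^4\varepsilon}{\gamma^2} \tag{11.25} $$

or by (11.17)

$$ C^2 = \frac{k + \gamma/\sin\varepsilon}{p!\, \Gamma(p+s+1)}\, \frac{\sin^4\varepsilon}{\gamma^2} \tag{11.25*} $$

From this, if we introduce the normalized polynomials

$$ Q_p^{*s}(x) = \frac{Q_p^s(x)}{[p!\, \Gamma(p+s+1)]^{1/2}} $$

we get

$$ F(x) = \frac{\sin^2\varepsilon}{\gamma}\left( \frac{\gamma}{\sin\varepsilon} + k \right)^{1/2} x^{s/2} e^{-x/2}\, Q_p^{*s}(x) \tag{11.26} $$

$$ G(x) = \frac{\sin^2\varepsilon}{\gamma}\left( \frac{\gamma}{\sin\varepsilon} - k \right)^{1/2} x^{s/2} e^{-x/2}\, Q_{p-1}^{*s}(x) \tag{11.26*} $$

It is convenient to denote $\gamma/\sin\varepsilon$ by $n^*$:

$$ n^* = \frac{\gamma}{\sin\varepsilon} \tag{11.27} $$

Its square is

$$ (n^*)^2 = p^2 + 2p(k^2 - \gamma^2)^{1/2} + k^2 \tag{11.28} $$

so that $n^*$ differs only slightly from the integer

$$ n = p + |k| \tag{11.29} $$

which can be interpreted as the principal quantum number.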

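By inserting $n^*$ into (11.26) and (11.26*) we can write the two
functions as

$$ F = \frac{\gamma}{(n^*)^2}\, (n^* + k)^{1/2}\, x^{s/2} e^{-x/2}\, Q_p^{*s}(x) \tag{11.30} $$

$$ G = \frac{\gamma}{(n^*)^2}\, (n^* - k)^{1/2}\, x^{s/2} e^{-x/2}\, Q_{p-1}^{*s}(x), \qquad x = \frac{2r_1}{n^*} \tag{11.30*} $$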


For a given value of the principal quantum number $n$ the number
$k$ can take on the following sequence:

$$ -n+1, \; -n+2, \; \ldots, \; -1, \; +1, \; \ldots, \; n-1, \; n \tag{11.31} $$

in all $2n - 1$ values. The number $k$ cannot be equal to $-n$ since
then the lower index $p - 1$ of the polynomial in (11.30) would become negative.
But the value $k = +n$ is possible because in this case from
(11.29) we get $p = 0$, and from (11.28) we get $n^* = k$, so that
the factor $(n^* - k)^{1/2}$ multiplying that polynomial vanishes.

The number $p$ is closely connected with the radial quantum
number $n_r$ of Schrödinger’s theory. Namely, because of

$$ n = n_r + l + 1 = p + |k| $$

and the relationship between $l$ and $k$ we have

$$ p = n_r \quad \text{at } k > 0 $$
$$ p = n_r + 1 \quad \text{at } k < 0 \tag{11.32} $$

Thus, we have found the eigenfunctions corresponding to the
discrete spectrum. It presents no difficulty to solve our equations
for the value $W = +mc^2$, which corresponds to the boundary between
the discrete and continuous spectra, and for the continuous
spectrum. But we will not do so. 
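
The counting in (11.31) and the relations (11.32) between $p$ and the radial quantum number $n_r$ can be checked by direct enumeration. The following Python sketch does so; the helper names `k_values` and `radial_numbers` are assumptions introduced here for illustration.

```python
def k_values(n):
    """Admissible k for principal quantum number n, as listed in (11.31)."""
    return [k for k in range(-n + 1, n + 1) if k != 0]

def radial_numbers(n, k):
    """Return (p, n_r) for given n and k."""
    p = n - abs(k)                 # from (11.29): n = p + |k|
    l = k - 1 if k > 0 else -k     # from (8.2)
    n_r = n - l - 1                # Schrodinger relation n = n_r + l + 1
    return p, n_r

for n in range(1, 5):
    ks = k_values(n)
    assert len(ks) == 2 * n - 1    # "in all 2n - 1 values"
    for k in ks:
        p, n_r = radial_numbers(n, k)
        # (11.32): p = n_r for k > 0 and p = n_r + 1 for k < 0
        assert p == (n_r if k > 0 else n_r + 1)
    print(n, ks)
```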


12. Fine-structure levels of hydrogen 


Now we will express energy in terms of quantum numbers.
According to (11.5) and (11.14),

$$ W = mc^2\cos\varepsilon = \frac{mc^2\, [p + (k^2 - \gamma^2)^{1/2}]}{\{ [p + (k^2 - \gamma^2)^{1/2}]^2 + \gamma^2 \}^{1/2}} \tag{12.1} $$

or by means of (11.27)

$$ W = mc^2 \left( 1 - \frac{\gamma^2}{(n^*)^2} \right)^{1/2} \tag{12.1*} $$

Formula (12.1) is called the Sommerfeld formula.

As we noted at the end of Section 7, Dirac’s theory gives only
positive levels. The lowest level (the ground state of hydrogen)
corresponds to quantum numbers $k = +1$, $p = 0$ ($n = 1$), and
is

$$ W_0 = mc^2 (1 - \gamma^2)^{1/2} \tag{12.2} $$

The entire discrete spectrum is located in the interval

$$ mc^2 (1 - \gamma^2)^{1/2} \le W < mc^2 \tag{12.3} $$

whereas in the interval

$$ -mc^2 < W < mc^2 (1 - \gamma^2)^{1/2} \tag{12.4} $$

there are no energy eigenvalues at all.

To compare the Sommerfeld formula with the Bohr formula,
which we derived in Schrödinger’s theory, we will go on to approximate
formulas. By approximately extracting the square root in
(12.1*) we get

$$ W = mc^2 - \frac{mc^2\gamma^2}{2(n^*)^2} - \frac{mc^2\gamma^4}{8(n^*)^4} \tag{12.5} $$

But by (11.6) and (11.19)

$$ mc^2\gamma^2 = \frac{e^2}{a} \tag{12.6} $$

whereas by (11.28) and (11.29)

$$ (n^*)^2 = n^2 + 2(n - |k|)\, [(k^2 - \gamma^2)^{1/2} - |k|] \tag{12.7} $$

Using these expressions, we get, to within terms of the order of
$mc^2\gamma^6$,

$$ W = mc^2 - \frac{e^2}{2an^2} - \frac{e^2\gamma^2}{2an^3}\left( \frac{1}{|k|} - \frac{3}{4n} \right) \tag{12.8} $$

The first term is a constant (the relativistic rest energy), the
second is the (nonrelativistic) formula for the Bohr levels of
hydrogen, and the third term provides a relativistic correction to
it. This correction depends upon $k$ as well as upon the principal
quantum number $n$. For this reason the energy levels of hydrogen,
which in Schrödinger’s theory depended only on $n$ and did not
change when the azimuthal quantum number $l$ changed, split into
several levels lying close to each other. These are obtained if we
give the number $k$ in (12.8) all permissible values (11.31). The
result is the fine structure of the hydrogen spectral lines observed
in experiments. We note that the energy levels depend only on
the absolute value of $k$ (that is, on $j$ but not on $l$), so that, say,
the spectral terms $P_{3/2}$ and $D_{3/2}$ for hydrogen coincide.
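
As a numerical illustration, the Sommerfeld formula (12.1) can be compared with the expansion (12.8), with all energies expressed in units of $mc^2$ so that $e^2/a = \gamma^2$ by (12.6). The sketch below is not part of the original course; the numerical value of the fine-structure constant and the function names are assumptions.

```python
import math

GAMMA = 1 / 137.036  # fine-structure constant (assumed numerical value)

def sommerfeld(n, k, gamma=GAMMA):
    """Level (12.1) in units of mc^2 for principal number n and nonzero k."""
    p = n - abs(k)                           # (11.29)
    root = math.sqrt(k * k - gamma * gamma)
    return (p + root) / math.sqrt((p + root) ** 2 + gamma ** 2)

def expansion(n, k, gamma=GAMMA):
    """Approximate level (12.8) in units of mc^2, using e^2/a = mc^2 gamma^2."""
    bohr = gamma**2 / (2 * n**2)
    correction = gamma**4 / (2 * n**3) * (1 / abs(k) - 3 / (4 * n))
    return 1 - bohr - correction

# n = 2 doublet with l = 1: k = l + 1 = 2 and k = -l = -1
exact_split = sommerfeld(2, 2) - sommerfeld(2, -1)
approx_split = expansion(2, 2) - expansion(2, -1)
print(exact_split, approx_split)
```

The two splittings agree up to relative corrections of the order of $\gamma^2$, and `sommerfeld(3, 2)` equals `sommerfeld(3, -2)`, which reflects the coincidence of the $P_{3/2}$ and $D_{3/2}$ terms.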

Next we will find the doublet separation in the central-field
problem (the doublet in alkali metals), that is, the value

$$ \Delta W = W(n, k) - W(n, -k+1) \tag{12.9} $$

If we consider $k > 0$ and put $k = l + 1$, the formula (12.8) gives

$$ \Delta W = \frac{e^2\gamma^2}{2an^3\, l(l+1)} \tag{12.10} $$

We computed the same quantity for the general case of a cen¬ 
tral field in Section 6 [see formula (6.13)]. Let us apply that for¬ 
mula to hydrogen. We have 

A£ =w< 2 '+ l >S4R< r >] 8 ‘ ,r 

0 

(12.11) 

Here we must put 


[/?(r)r dr = «('*)*. 

so that 

(12.12) 


(12.13) 


Using the expression (6.11), Chapter V, Part II, for $R_{nl}(r_1)$
and introducing the integration variable $x = 2r_1/n$, we get

A£ = ^ I [e ,, ( W]S dx ([2 I4) 

0 


We denote the integral by $I$. If we put $n - l - 1 = p$ and
$2l + 1 = s$, we can write the integral as

$$ I = \int_0^\infty x^{s-2} e^{-x}\, [Q_p^{*s}(x)]^2\, dx \tag{12.15} $$

This integral was calculated in Section 4, Chapter V, Part II
[formulas (4.21) and (4.23)]. It is

$$ I = \frac{2p + s + 1}{(s-1)\, s\, (s+1)} = \frac{n}{2(2l+1)\, l(l+1)} \tag{12.16} $$

Substitution of $I$ in (12.14) yields

$$ \Delta E = \frac{e^2\gamma^2}{2an^3\, l(l+1)} \tag{12.17} $$

that is, the previous result (12.10).


13. The Zeeman effect. Statement of the problem

The energy levels for a central field, as we have seen, depend
only on the quantum numbers $n$ and $k$. The third quantum number
$m$ does not enter into the fine structure formula (12.8), so
that one level can correspond to states with different values of $m$.

But if we place the atom in a magnetic field, each level splits
into several levels that differ by the value of $m$. This is what is
called the Zeeman effect. Schrödinger’s theory proves insufficient
to explain this effect. But Dirac’s theory, as we will now see, provides
a full explanation of this effect and one that corresponds to
experimental data in every respect.

In Section 5, Part III, we examined a generalization of the
Schrödinger equation for the case of a magnetic field, namely,
the Pauli equation. But the Pauli equation disregards relativistic
corrections. At the same time the splitting of levels in a magnetic
field is, generally speaking, of the same order of magnitude as
these corrections. Hence both must be considered simultaneously.
For this reason to explain the Zeeman effect we must examine the
Dirac equation, which takes into account both the theory of relativity
and the magnetic field.

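We assume that we have a constant magnetic field $\mathcal{H}$ directed
along the z axis. As we know, the vector potential of this field
will be

$$ A_x = -\tfrac{1}{2}|\mathcal{H}|\, y, \qquad A_y = \tfrac{1}{2}|\mathcal{H}|\, x, \qquad A_z = 0 \tag{13.1} $$

and the generalized component of the vector potential corresponding
to angle $\varphi$ can be found by the formula

$$ A_x\, dx + A_y\, dy = A_\varphi\, d\varphi $$

or

$$ \tfrac{1}{2}|\mathcal{H}|\, (x\, dy - y\, dx) = \tfrac{1}{2}|\mathcal{H}|\, \rho^2\, d\varphi $$

whence

$$ A_\varphi = \tfrac{1}{2}|\mathcal{H}|\, \rho^2 = \tfrac{1}{2}|\mathcal{H}|\, r^2 \sin^2\theta \tag{13.2} $$

The additional term we must add to the Hamiltonian in the
absence of a field, or the zeroth-order Hamiltonian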

$$ H^* = c\rho_a \left( \frac{\sigma_1}{r}\, p_\theta + \frac{\sigma_2}{r\sin\theta}\, p_\varphi + \sigma_3\, p_r \right) + mc^2\rho_c + U(r) $$

is found according to the general rule by replacing $p_\varphi$ with

$$ p_\varphi \to p_\varphi + \frac{e}{c} A_\varphi \tag{13.3} $$

We denote this term by $R$. It will be

$$ R = \tfrac{1}{2}\, e|\mathcal{H}|\, \rho_a\sigma_2\, r\sin\theta \tag{13.4} $$

To find the correction energy due to this term we must apply
perturbation theory and calculate the matrix elements of $R$ that
correspond to the various transitions. As we saw in Section 7,
Chapter II, Part II, the main role is played by the matrix elements
that correspond to the transitions between levels that are
very close to each other, that is, between the levels of one doublet
in our case. Hence we will consider only these transitions
and put

$$ n' = n \tag{13.5} $$

$$ k'(k'-1) = k(k-1) \tag{13.5*} $$

As for the quantum number $m$, operator $R$ commutes with the
operator of differentiation with respect to $\varphi$, that is, with $\mathcal{M}_z$,
whose eigenvalues are $(m + \tfrac{1}{2})\hbar$. For this reason (see the second
derivation of the selection rule in Section 10) the matrix for $R$
will be diagonal in $m$, so that we must put

$$ m' = m \tag{13.6} $$

Therefore we need the following matrix elements:

$$ (k|R|k), \qquad (k|R|-k+1) $$
$$ (-k+1|R|k), \qquad (-k+1|R|-k+1) \tag{13.7} $$

(for brevity we omit the quantum numbers $n$ and $m$).

The total Hamiltonian is

$$ H = H^* + R $$

and the eigenvalue equation will be

$$ (H^* + R)\psi = W\psi \tag{13.8} $$

Using the method outlined in Section 7, Chapter II, Part II, we
seek the approximate eigenfunction as

$$ \psi = c_1 \psi^*_k + c_2 \psi^*_{-k+1} \tag{13.9} $$

where $\psi^*_k$ are the eigenfunctions of the zeroth-order Hamiltonian,
$H^*$. We substitute (13.9) into (13.8), premultiply by $\bar\psi^*_k$ and
integrate. We also multiply the result of the above substitution by
$\bar\psi^*_{-k+1}$ and integrate. This gives us two equations

$$ [W_k + (k|R|k)]\, c_1 + (k|R|-k+1)\, c_2 = W c_1 $$
$$ (-k+1|R|k)\, c_1 + [W_{-k+1} + (-k+1|R|-k+1)]\, c_2 = W c_2 \tag{13.10} $$

with $W_k$ the unperturbed energy level corresponding to a given
value of $k$. We set the determinant of the system (13.10), which
is constructed from the coefficients of the unknowns $c_1$ and $c_2$, equal to zero.
This yields for $W$ a quadratic equation whose roots are

$$ W = \tfrac{1}{2}(W_k + W_{-k+1} + R_k + R_{-k+1}) \pm \tfrac{1}{2}\left[ (W_k - W_{-k+1} + R_k - R_{-k+1})^2 + 4\, |(-k+1|R|k)|^2 \right]^{1/2} \tag{13.11} $$

where for brevity we put

$$ R_k = (k|R|k) \tag{13.12} $$

Equation (13.11) gives the corrected values of the energy level. 
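
The roots (13.11) are simply the eigenvalues of the two-by-two system (13.10). A short Python sketch, with argument names chosen here for illustration and arbitrary sample numbers, evaluates them and confirms that each root makes the determinant of the system vanish.

```python
import math

def doublet_levels(w_k, w_partner, r_kk, r_pp, r_kp):
    """Roots (13.11) of the secular equation of system (13.10).

    w_k, w_partner : unperturbed levels W_k and W_{-k+1}
    r_kk, r_pp     : diagonal elements R_k and R_{-k+1}
    r_kp           : off-diagonal element (-k+1|R|k), possibly complex
    """
    half_sum = 0.5 * (w_k + w_partner + r_kk + r_pp)
    diff = w_k - w_partner + r_kk - r_pp
    half_root = 0.5 * math.sqrt(diff**2 + 4 * abs(r_kp) ** 2)
    return half_sum + half_root, half_sum - half_root

def secular_determinant(w, w_k, w_partner, r_kk, r_pp, r_kp):
    # Determinant built from the coefficients of c_1 and c_2 in (13.10)
    return (w_k + r_kk - w) * (w_partner + r_pp - w) - abs(r_kp) ** 2

# Arbitrary sample values, used only to exercise the formulas
sample = (1.0, 0.9, 0.02, 0.01, 0.03)
for level in doublet_levels(*sample):
    print(level, secular_determinant(level, *sample))
```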


14. Calculation of the perturbation matrix 

Now we turn to computing the matrix elements (13.7). Using
the value of $\alpha_2 = \rho_a\sigma_2$ given by (4.29), Chapter I, and the expression
(4.20) for the function $\psi^*$, we get

(efc | /? | k') 

= T el X1 $ $ $ r sin 0 W “ fs') ( Z y ' ~ yz ') dr dQ d( f (14.1) 


Let us express $f$ and $g$ in terms of $f_1$ and $f_2$ with the help of (6.2):

$$ f = \frac{f_1 + i f_2}{\sqrt{2}}, \qquad g = \frac{f_1 - i f_2}{\sqrt{2}} \tag{14.2} $$

and consider $f_1$ and $f_2$ real. As for the functions $Y(\theta, \varphi)$ and
$Z(\theta, \varphi)$ of $\theta$ and $\varphi$, we express them as

y(9 ’ <P)== '(4nF e,,m+,/ ’ )M ( 9 ) 

Z(9 > <P ) = -^ r e ,(m+,/ * )<p B (8) (14.3) 

in terms of functions $A(\theta)$ and $B(\theta)$, which depend only on one
angle, $\theta$ [see Section 3, Part III]. Since at $m = m'$ the integrand
in (14.1) does not depend on $\varphi$, integration with respect to $\varphi$
amounts to multiplying by $2\pi$. We get

$$ (k|R|k') = -\tfrac{1}{2}\, e|\mathcal{H}| \int_0^\infty r\, (f_1 f_2' + f_2 f_1')\, dr \int_0^\pi \sin\theta\, (BA' + AB')\, d\theta \tag{14.4} $$

We will show that the left integral can be found approximately
without solving the radial equations. According to (6.3) these
equations have the form

$$ \frac{df_1}{dr} - \frac{k}{r} f_1 = \frac{-mc^2 - W + U}{\hbar c}\, f_2 $$
$$ \frac{df_2}{dr} + \frac{k}{r} f_2 = \frac{-mc^2 + W - U}{\hbar c}\, f_1 \tag{14.5} $$

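In Section 7 we saw that the radial functions of a discrete spectrum
fall off at infinity exponentially and vanish at $r = 0$. For
this reason we have the identity

$$ \int_0^\infty (f_1 f_1' + f_2 f_2')\, dr = -\int_0^\infty r\, \frac{d}{dr}(f_1 f_1' + f_2 f_2')\, dr \tag{14.6} $$

On the right-hand side we replace the derivatives with their
expressions given by differential equations (14.5) and we get

$$ \int_0^\infty (f_1 f_1' + f_2 f_2')\, dr = -(k + k') \int_0^\infty (f_1 f_1' - f_2 f_2')\, dr + \frac{2mc}{\hbar} \int_0^\infty r\, (f_1 f_2' + f_2 f_1')\, dr + \frac{W' - W}{\hbar c} \int_0^\infty r\, (f_1 f_2' - f_2 f_1')\, dr \tag{14.7} $$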


The second term on the right-hand side is the sought integral
that enters into (14.4). We note now that $f_2$ is very small compared
with $f_1$ and that for the values of $k$ and $k'$ we are considering,
for which $k(k-1) = k'(k'-1)$, the difference $W' - W$ (the
doublet separation) is very small compared with $2mc^2$, and $f_1'$
differs very little from $f_1$ [both approximately satisfy the same
equation (6.7)]. If we take into account the normalization of the
functions, we get from (14.7) the approximate formula for the
integral with respect to $r$ in (14.4):

$$ \int_0^\infty r\, (f_1 f_2' + f_2 f_1')\, dr = \frac{\hbar}{2mc}\, (k + k' + 1) \tag{14.8} $$

If we introduce the Larmor frequency

$$ \omega_L = \frac{e|\mathcal{H}|}{2mc} \tag{14.9} $$


and insert (14.8) into (14.4), we get

$$ (k|R|k') = -\tfrac{1}{2}\hbar\omega_L\, (k + k' + 1) \int_0^\pi \sin\theta\, (BA' + AB')\, d\theta \tag{14.10} $$

Now we have only to compute the integral

$$ J = \int_0^\pi \sin\theta\, (BA' + AB')\, d\theta \tag{14.11} $$

Integrating by parts and using Eq. (3.5), Part III, for the functions
$A$ and $B$, we get

$$ J = \int_0^\pi \cos\theta\, \frac{d}{d\theta}(BA' + AB')\, d\theta = -(k + k') \int_0^\pi \cos\theta\, (AA' - BB')\, d\theta \tag{14.12} $$

We multiply (14.11) by $k + k'$ and add the result to (14.12).
This yields

$$ (k + k' + 1)\, J = -(k + k')\, \mathrm{Re} \int_0^\pi e^{i\theta} (A + iB)(A' + iB')\, d\theta \tag{14.13} $$

Now according to the formula

$$ A + iB = (\sin\theta)^{1/2}\, e^{-i\theta/2}\, (y_1 + i y_2) \tag{14.14} $$

we introduce the functions $y_1$ and $y_2$, used earlier in Pauli’s
theory [see (3.13), Part III]. Then

$$ (k + k' + 1)\, J = -(k + k') \int_0^\pi (y_1 y_1' - y_2 y_2') \sin\theta\, d\theta \tag{14.13*} $$

so that

$$ (k|R|k') = \tfrac{1}{2}\hbar\omega_L\, (k + k') \int_0^\pi (y_1 y_1' - y_2 y_2') \sin\theta\, d\theta \tag{14.15} $$

Owing to the orthogonality of the spherical harmonics $P_l^m$, which
can be used to express $y_1$ and $y_2$, the integral (14.15) differs from
zero only when $l' = l$, that is, when (13.5*) holds. Hence the matrix
elements (13.7) are not only the most important for calculating
corrections but the only ones that differ from zero (if $n' = n$).

To evaluate the integral (14.15) it is sufficient to express $y_1$
and $y_2$ according to (3.24), Part III, in terms of ordinary spherical
harmonics and use the normalization condition for the latter. As
a result we get

$$ (k|R|k') = \tfrac{1}{2}\hbar\omega_L\, \frac{k + k'}{|2k - 1|} \left\{ \frac{(k+m)(k'+m)}{[\,|(k+m)(k'+m)|\,]^{1/2}} - [\,|(k-m-1)(k'-m-1)|\,]^{1/2} \right\} \tag{14.16} $$

Giving $k'$ the values $k$ and $-k+1$, we obtain

$$ R_k = (k|R|k) = \hbar\omega_L\, \frac{2k}{2k-1}\, (m + \tfrac{1}{2}) \tag{14.17} $$

$$ (k|R|-k+1) = -\hbar\omega_L\, \frac{[(k+m)(k-m-1)]^{1/2}}{|2k-1|} \tag{14.18} $$

Replacing $k$ by $-k+1$ in (14.17) yields

$$ R_{-k+1} = (-k+1|R|-k+1) = \hbar\omega_L\, \frac{2k-2}{2k-1}\, (m + \tfrac{1}{2}) \tag{14.19} $$

We have now computed all the matrix elements in (13.7).


15. Splitting of energy levels in a magnetic field 

To find the shifted energy levels we need only insert the expressions
found for the matrix elements of $R$ into formula (13.11).
For brevity we denote the half-sum of the spectral terms in a
doublet by $W_0$:

$$\tfrac{1}{2}\,(W_k + W_{-k+1}) = W_0 \tag{15.1}$$

and put

$$W_k - W_{-k+1} = \Delta W \tag{15.2}$$

We insert these expressions along with (14.17), (14.18), and
(14.19) into (13.11) and get

$$W = W_0 + \hbar\omega_L\left(m+\tfrac{1}{2}\right) \pm \tfrac{1}{2}\left[(\Delta W)^2 + 2\,\Delta W\,\hbar\omega_L\,\frac{2m+1}{2k-1} + \hbar^2\omega_L^2\right]^{1/2} \tag{15.3}$$

This formula fully describes the Zeeman effect. 
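
As a cross-check on (15.3), the short Python sketch below compares it with the eigenvalues of a two-level secular problem. It assumes that (13.11) is the usual formula for the roots of a symmetric 2x2 matrix and that the matrix elements have the forms written in the comments: diagonal elements proportional to 2k/(2k-1) and (2k-2)/(2k-1), and an off-diagonal element proportional to the square root of (k+m)(k-m-1) divided by 2k-1. The function names and the parameter `hw` (standing for the product of Planck's constant and the Larmor frequency) are illustrative and do not come from the text.

```python
import math

def zeeman_levels(W_k, W_mk1, k, m, hw):
    """Both roots of (15.3); hw is hbar*omega_L in the same units as W."""
    W0 = 0.5 * (W_k + W_mk1)                      # half-sum, (15.1)
    dW = W_k - W_mk1                              # doublet separation, (15.2)
    x = (2 * m + 1) / (2 * k - 1)
    root = math.sqrt(dW**2 + 2 * dW * hw * x + hw**2)
    centre = W0 + hw * (m + 0.5)
    return centre + 0.5 * root, centre - 0.5 * root

def secular_levels(W_k, W_mk1, k, m, hw):
    """Eigenvalues of [[W_k + R_k, R_off], [R_off, W_mk1 + R_mk1]] (assumed form of (13.11))."""
    R_k = hw * 2 * k / (2 * k - 1) * (m + 0.5)                     # (14.17)
    R_off = -hw * math.sqrt((k + m) * (k - m - 1)) / (2 * k - 1)   # (14.18)
    R_mk1 = hw * (2 * k - 2) / (2 * k - 1) * (m + 0.5)             # (14.19)
    a, d = W_k + R_k, W_mk1 + R_mk1
    root = math.sqrt((a - d) ** 2 + 4 * R_off**2)
    return 0.5 * (a + d + root), 0.5 * (a + d - root)

if __name__ == "__main__":
    # k = 2 and -k + 1 = -1 share the values m = -1 and m = 0
    for m in (-1, 0):
        print(m, zeeman_levels(1.0, 0.7, 2, m, 0.3), secular_levels(1.0, 0.7, 2, m, 0.3))
```

The two functions agree because the sum of the diagonal elements is twice the product of `hw` and (m + 1/2), while the square of their difference plus four times the squared off-diagonal element equals the square of `hw`. Only the square of the off-diagonal element enters, so its sign does not affect the levels.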

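When the magnetic field is weak, so that $\hbar\omega_L$ is small compared
with the doublet separation $\Delta W$, we can extract the square root
approximately by dropping $\hbar^2\omega_L^2$. The result will be two levels:

$$W' = W_0 + \tfrac{1}{2}\Delta W + \hbar\omega_L\left(m+\tfrac{1}{2}\right)\frac{2k}{2k-1} \tag{15.4}$$

$$W'' = W_0 - \tfrac{1}{2}\Delta W + \hbar\omega_L\left(m+\tfrac{1}{2}\right)\frac{2k-2}{2k-1} \tag{15.4*}$$

or, if instead of $W_0$ and $\Delta W$ we insert their values (15.1)
and (15.2),

$$W' = W_k + \hbar\omega_L\,\frac{k}{k-\frac{1}{2}}\left(m+\tfrac{1}{2}\right) \tag{15.5}$$

$$W'' = W_{-k+1} + \hbar\omega_L\,\frac{k-1}{k-\frac{1}{2}}\left(m+\tfrac{1}{2}\right) \tag{15.5*}$$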

In a magnetic field each spectral term splits into $2|k|$ separate
terms corresponding to

$$m = -|k|,\ -|k|+1,\ \ldots,\ |k|-1 \tag{15.6}$$

The distance between two adjacent terms is 

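$$\hbar\omega_L\,\frac{k}{k-\frac{1}{2}} = \hbar\omega_L\,g \tag{15.7}$$

where

$$g = \frac{k}{k-\frac{1}{2}} \tag{15.8}$$

is what is known as the Landé factor (also called the g-factor).
Since it is always positive, it can be represented as

$$g = \frac{|k|}{\left|k-\frac{1}{2}\right|} = \frac{j+\frac{1}{2}}{l+\frac{1}{2}} \tag{15.9}$$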


where $j$ and $l$ are defined as usual. For different spectral terms,
corresponding to $k = 1, -1, 2, -2, \ldots$, the Landé factor takes
on the following values:

  k    l    j     term       g

  1    0   1/2   S_{1/2}     2
 -1    1   1/2   P_{1/2}    2/3
  2    1   3/2   P_{3/2}    4/3
 -2    2   3/2   D_{3/2}    4/5
  3    2   5/2   D_{5/2}    6/5
 -3    3   5/2   F_{5/2}    6/7


The case just considered is referred to as the anomalous Zeeman 
effect. 
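
The tabulated values can be reproduced with exact rational arithmetic. The sketch below assumes the Landé factor has the form g = k/(k - 1/2) and uses the usual assignment of l and j to the quantum number k (l = k - 1 for k > 0, l = -k for k < 0, j = |k| - 1/2). The helper names are illustrative only.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def lande_g(k):
    """Lande factor g = k / (k - 1/2) for k = +-1, +-2, ..."""
    return Fraction(2 * k, 2 * k - 1)

def l_and_j(k):
    """Orbital number l and total angular momentum j belonging to k."""
    l = k - 1 if k > 0 else -k
    j = abs(k) - HALF
    return l, j

if __name__ == "__main__":
    for k in (1, -1, 2, -2, 3, -3):
        l, j = l_and_j(k)
        g = lande_g(k)
        # The same value expressed through j and l
        assert g == (j + HALF) / (l + HALF)
        print(k, l, j, g)
```

Using `Fraction` keeps the values exact, so they can be compared with the table directly.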

We shall now turn to the normal Zeeman effect, to be found in
strong magnetic fields. When a field is so strong that $\hbar\omega_L$ is great
compared with $\Delta W$, we can extract the square root approximately
by ignoring the square of $\Delta W$. We then get two levels

$$W^{*} = W_0 + \hbar\omega_L\,(m+1) + \Delta W\,\frac{m+\frac{1}{2}}{2k-1} \tag{15.10}$$

$$W^{**} = W_0 + \hbar\omega_L\,m - \Delta W\,\frac{m+\frac{1}{2}}{2k-1} \tag{15.11}$$

or, if we ignore $\Delta W$ as well,

$$W^{*} = W_0 + \hbar\omega_L\,(m+1) \tag{15.12}$$

$$W^{**} = W_0 + \hbar\omega_L\,m \tag{15.13}$$


In this case the distance between the components of the Zeeman
multiplet no longer depends on quantum number $k$ and is equal
to $\hbar\omega_L$. Thus, when the magnetic field gets stronger, several
components corresponding to one value of $m$ but to different $k$'s
merge to form one. This is the transition from the anomalous
Zeeman effect to the normal Zeeman effect.

If $\Delta W > 0$, then when the field gets stronger the spectral
term $W'$ becomes $W^{*}$ and the term $W''$ becomes $W^{**}$. But if
$\Delta W < 0$, then $W'$ becomes $W^{**}$ and $W''$ becomes $W^{*}$. Since the
square root in (15.3) keeps its sign when $\hbar\omega_L$ changes, the two
spectral terms do not intersect when the magnetic field changes.
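
The two limiting cases and the absence of level crossing can be illustrated numerically. The sketch below is a minimal check under stated assumptions: the general formula is taken with the coefficient (2m+1)/(2k-1) in the cross term of the radicand, the weak-field limit is written for a positive doublet separation, and all function and parameter names (`hw` for the product of Planck's constant and the Larmor frequency, `dW` for the doublet separation) are illustrative.

```python
import math

def zeeman_levels(W0, dW, k, m, hw):
    """Upper and lower level from the general formula (15.3)."""
    x = (2 * m + 1) / (2 * k - 1)
    root = math.sqrt(dW**2 + 2 * dW * hw * x + hw**2)
    centre = W0 + hw * (m + 0.5)
    return centre + 0.5 * root, centre - 0.5 * root

def weak_field(W0, dW, k, m, hw):
    """Weak-field levels (15.4), (15.4*), valid for dW > 0 and hw << dW."""
    upper = W0 + 0.5 * dW + hw * (m + 0.5) * (2 * k) / (2 * k - 1)
    lower = W0 - 0.5 * dW + hw * (m + 0.5) * (2 * k - 2) / (2 * k - 1)
    return upper, lower

def strong_field(W0, k, m, hw):
    """Strong-field levels (15.12), (15.13), valid for hw >> |dW|."""
    return W0 + hw * (m + 1), W0 + hw * m

def min_gap(W0, dW, k, m, fields):
    """Smallest separation of the two levels over a list of field values."""
    return min(u - l for u, l in (zeeman_levels(W0, dW, k, m, hw) for hw in fields))

if __name__ == "__main__":
    print(zeeman_levels(0.0, 1.0, 2, 0, 1e-3), weak_field(0.0, 1.0, 2, 0, 1e-3))
    print(zeeman_levels(0.0, 1.0, 2, 0, 1e3), strong_field(0.0, 2, 0, 1e3))
    print(min_gap(0.0, 1.0, 2, -1, [0.01 * i for i in range(501)]))
```

For the values of m shared by both doublet components the ratio (2m+1)/(2k-1) is smaller than one in absolute value, so the radicand is a sum of two squares of which one never vanishes. The separation of the two levels therefore stays positive for every field strength.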

Experimental data confirm this in every detail. In fact, the 
formulas derived here were first found empirically. 

The Zeeman effect makes it possible to compare with experiment
the relative intensities of the lines that correspond to the
transitions between spectral terms with given $k$ and $k'$ and different
values of $m$ and $m'$. These intensities can be calculated without
a knowledge of radial functions. Indeed, in expressions of
the type (9.6) for elements of Heisenberg's matrices the factor
$r_D(nk|n'k')$ does not depend on $m$ or $m'$. For this reason, according
to (3.16), Chapter III, Part II, the intensities will be proportional
to the quantities

$$I = |(km|\sin\theta\cos\varphi|k'm')|^2 + |(km|\sin\theta\sin\varphi|k'm')|^2 + |(km|\cos\theta|k'm')|^2 \tag{15.14}$$


Using the table in Section 9 and the formulas (9.15), (9.15*),
and (9.15**), we get, for example, for the value $k' = k + 1$ and
for the cases $m' = m - 1$, $m' = m$, and $m' = m + 1$ the following
values of $I$:

$$I_{-} = \frac{1}{2}\,\frac{(k-m)(k-m+1)}{(2k+1)^2}, \qquad m' = m-1 \tag{15.15}$$

$$I_{0} = \frac{(k-m)(k+m+1)}{(2k+1)^2}, \qquad m' = m \tag{15.16}$$

$$I_{+} = \frac{1}{2}\,\frac{(k+m+1)(k+m+2)}{(2k+1)^2}, \qquad m' = m+1 \tag{15.17}$$


Here $I_0$ gives the intensity of light polarized in the direction of
the magnetic field, and $I_{-}$ and $I_{+}$ give the intensity of light
polarized in the plane perpendicular to this direction. The sum of
these quantities

$$I_{-} + I_{0} + I_{+} = \frac{k+1}{2k+1}$$

does not depend on m.
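
That the sum is independent of m can be verified with exact fractions. The sketch below assumes the three intensities for k' = k + 1 have the forms written in the comments; the function names are illustrative.

```python
from fractions import Fraction

def intensities(k, m):
    """Relative intensities for k' = k + 1; returns (I_minus, I_zero, I_plus)."""
    den = (2 * k + 1) ** 2
    i_minus = Fraction((k - m) * (k - m + 1), 2 * den)      # m' = m - 1
    i_zero = Fraction((k - m) * (k + m + 1), den)           # m' = m
    i_plus = Fraction((k + m + 1) * (k + m + 2), 2 * den)   # m' = m + 1
    return i_minus, i_zero, i_plus

def total(k, m):
    """Sum of the three polarization components."""
    return sum(intensities(k, m))

if __name__ == "__main__":
    for k in (1, 2, 3):
        # m runs over the 2k values allowed for the term with quantum number k
        print(k, [total(k, m) for m in range(-k, k)])
```

Writing a = k - m and b = k + m + 1, the numerators add up to (a + b)(a + b + 1)/2 with a + b = 2k + 1, which gives (k + 1)/(2k + 1) for every m.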

We note that the factor $r_D(nk|n'k')$ primarily depends only on
the quantum numbers $l$ and $l'$ and is approximately equal to the
corresponding quantity $r(nl; n'l')$ in Schrödinger's theory [see
(9.14), Chapter IV, Part II], so that its value is almost the same
for the two components of a doublet. This makes it possible to
compare the intensities of the Zeeman components belonging to
different components of the doublet.




Chapter III 


ON THE THEORY OF POSITRONS 


1. Charge conjugation

In Chapter I of Dirac's theory (more exactly, in Section 5) we
noted that it is possible to select the matrices in such a way that
the system of four equations for the components of the wave function
of a free electron will have real coefficients [Eqs. (5.9)].
According to (5.8), Chapter I, the corresponding Dirac matrices
will be

$$\alpha_1^0 = \sigma_1, \quad \alpha_2^0 = \rho_2\sigma_2, \quad \alpha_3^0 = \sigma_3, \quad \alpha_4^0 = \rho_3\sigma_2 \tag{1.1}$$

where $\sigma_1, \sigma_2, \sigma_3$ and $\rho_1, \rho_2, \rho_3$ are the Dirac matrices (3.18) and
(3.19), Chapter I. The matrix elements of $\alpha_1^0$, $\alpha_2^0$, $\alpha_3^0$ are real, and
the matrix elements of $\alpha_4^0$ are pure imaginary.
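
The reality properties and the anticommutation relations of this set of matrices can be checked directly. The sketch below assumes the choice sigma_1, rho_2 sigma_2, sigma_3, rho_3 sigma_2 for the four matrices and assumes that the rho and sigma matrices have the standard block structure (rho acting as a Pauli matrix on the first two-valued index, sigma on the second). The matrices (3.18) and (3.19) of Chapter I are not reproduced here, so this structure and all helper names are assumptions made for illustration.

```python
def kron(a, b):
    """Kronecker product of two 2x2 matrices, returned as a 4x4 nested list."""
    return [[a[i][j] * b[p][q] for j in range(2) for q in range(2)]
            for i in range(2) for p in range(2)]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def anticommutator(a, b):
    """a*b + b*a, element by element."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def is_real(mat):
    return all(complex(x).imag == 0 for row in mat for x in row)

def is_imaginary(mat):
    return all(complex(x).real == 0 for row in mat for x in row)

I2 = [[1, 0], [0, 1]]
T1 = [[0, 1], [1, 0]]
T2 = [[0, -1j], [1j, 0]]
T3 = [[1, 0], [0, -1]]

# Assumed block structure: rho_a = tau_a (x) 1, sigma_k = 1 (x) tau_k
rho = {a: kron(t, I2) for a, t in zip((1, 2, 3), (T1, T2, T3))}
sigma = {k: kron(I2, t) for k, t in zip((1, 2, 3), (T1, T2, T3))}

alpha0 = {
    1: sigma[1],
    2: matmul(rho[2], sigma[2]),
    3: sigma[3],
    4: matmul(rho[3], sigma[2]),
}

TWO_IDENTITY = [[2 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]

if __name__ == "__main__":
    print([is_real(alpha0[i]) for i in (1, 2, 3)], is_imaginary(alpha0[4]))
```

With these assumptions the first three matrices come out real and the fourth pure imaginary, and any two different matrices anticommute while each squares to the unit matrix.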

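Let us write the Dirac equation for an electron in an electromagnetic
field that corresponds to this choice of matrices. We
have

$$H\psi^0 = i\hbar\,\frac{\partial\psi^0}{\partial t} \tag{1.2}$$

For stationary states, when the wave function depends on time
via the factor $e^{-iWt/\hbar}$, we get

$$H\psi^0 = W\psi^0 \tag{1.3}$$

where

$$H\psi^0 \equiv \alpha_1^0\left(-i\hbar c\,\frac{\partial\psi^0}{\partial x} + eA_x\psi^0\right) + \alpha_2^0\left(-i\hbar c\,\frac{\partial\psi^0}{\partial y} + eA_y\psi^0\right) + \alpha_3^0\left(-i\hbar c\,\frac{\partial\psi^0}{\partial z} + eA_z\psi^0\right) + mc^2\alpha_4^0\psi^0 - e\Phi\psi^0 = W\psi^0 \tag{1.4}$$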

Now we write the equations complex conjugate to (1.4) and change
the sign on both sides. We get

$$\alpha_1^0\left(-i\hbar c\,\frac{\partial\bar\psi^0}{\partial x} - eA_x\bar\psi^0\right) + \alpha_2^0\left(-i\hbar c\,\frac{\partial\bar\psi^0}{\partial y} - eA_y\bar\psi^0\right) + \alpha_3^0\left(-i\hbar c\,\frac{\partial\bar\psi^0}{\partial z} - eA_z\bar\psi^0\right) + mc^2\alpha_4^0\bar\psi^0 + e\Phi\bar\psi^0 = -W\bar\psi^0 \tag{1.5}$$

These equations differ from the previous ones (aside from the substitution
of $\bar\psi^0$ for $\psi^0$) only in the different signs of the electron charge
$-e$ and the energy $W$. Hence the quantity that is complex conjugate
to the wave function of a particle with a negative charge and
negative energy (that is, to the wave function of an electron with
negative energy) can in a certain sense be associated with the
wave function of a particle that has a positive charge and positive
energy (that is, with the wave function of a positron in a state
with positive energy). This is not a direct association, however,
and cannot be interpreted in the context of the one-body problem.
Such an interpretation would conflict with the fundamental concepts
of quantum mechanics. But it does clear the way for an
interpretation of the mysterious states of the electron with negative
kinetic energy and the related second internal degree of freedom
of the electron, which we noted in Section 12, Chapter I.
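
The sign changes produced by complex conjugation can be illustrated in the simplest setting, a plane wave in constant potentials. This is a sketch under assumptions that go beyond the text: the four matrices are taken as sigma_1, rho_2 sigma_2, sigma_3, rho_3 sigma_2 in the standard block structure, the potentials are constants, and the derivative acting on a plane wave is replaced by the momentum p, so that conjugating the wave function replaces p by -p. All names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two 2x2 matrices, returned as a 4x4 nested list."""
    return [[a[i][j] * b[p][q] for j in range(2) for q in range(2)]
            for i in range(2) for p in range(2)]

I2 = [[1, 0], [0, 1]]
T1 = [[0, 1], [1, 0]]
T2 = [[0, -1j], [1j, 0]]
T3 = [[1, 0], [0, -1]]

# Assumed real-coefficient choice of matrices (rho on first index, sigma on second)
ALPHA = (kron(I2, T1), kron(T2, T2), kron(I2, T3))   # alpha_1, alpha_2, alpha_3
ALPHA4 = kron(T3, T2)                                # alpha_4, pure imaginary

def hamiltonian(p, e, A, phi, m=1.0, c=1.0):
    """4x4 matrix sum_i alpha_i (c p_i + e A_i) + m c^2 alpha_4 - e phi
    acting on the amplitude of a plane wave with momentum p (particle charge -e)."""
    H = [[0j] * 4 for _ in range(4)]
    for r in range(4):
        for s in range(4):
            for a_i, p_i, A_i in zip(ALPHA, p, A):
                H[r][s] += a_i[r][s] * (c * p_i + e * A_i)
            H[r][s] += m * c**2 * ALPHA4[r][s]
            if r == s:
                H[r][s] -= e * phi
    return H

def conjugate(mat):
    return [[complex(x).conjugate() for x in row] for row in mat]

def negate(mat):
    return [[-x for x in row] for row in mat]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

if __name__ == "__main__":
    p, A, e, phi = (0.3, -0.2, 0.5), (0.1, 0.4, -0.6), 0.7, 0.25
    lhs = conjugate(hamiltonian(p, e, A, phi))
    rhs = negate(hamiltonian(tuple(-x for x in p), -e, A, phi))
    print(close(lhs, rhs))
```

The identity checked here says that the conjugate of the matrix for charge -e and momentum p equals minus the matrix for charge +e and momentum -p. Consequently, if an amplitude belongs to energy W for the first matrix, its complex conjugate belongs to energy -W for the second, which is the content of the comparison between (1.4) and (1.5) in this restricted setting.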

2. Basic ideas of positron theory 

The physical interpretation of the second internal degree of 
freedom of the electron is based on the study of the problem of a 
physical system that consists of a variable number of charged 
particles but in which the total charge does not change (the law 
of conservation of the total charge). 

The mathematical statement of this problem requires introducing
operators of a new kind, namely, operators that act not on
the wave function of a fixed number of variables (corresponding
to the number of particles) but on a sequence of wave functions
that depend on the variables for one, two, three, etc. particles.
These new operators convert each of these functions into a function
of variables for a number of particles greater by one (the
creation operator) or less by one (the annihilation operator).

These operators of a new kind are a formal generalization of 
the one-particle wave function. The creation operators generalize 
the wave function itself, and the annihilation operators generalize 
the quantity complex conjugate to it. 

Operators that represent a formal generalization of the wave
function are sometimes referred to as the quantized wave function,
and the transition from an ordinary wave function to a
quantized one is known as second quantization. Second quantization is
also applied to systems consisting of an indefinite number of
uncharged particles: light quanta, or photons. This is the subject
of quantum electrodynamics.

The theory of second quantization will not be considered in this 
book. It can be found in the author’s Studies in Quantum Field 
Theory. 3 


3 V. A. Fock, Studies in Quantum Field Theory, Leningrad University Press,
Leningrad, 1957 (in Russian). (The book contains articles that were first
published in 1928-37.)

3. Positrons as unfilled states

In conclusion, a few words about the model proposed by Dirac
for positrons as unfilled states (vacancies, or holes) in some fictitious
“sea” (sometimes called the Dirac sea) of electrons in states
with negative kinetic (and total) energy. The electric charge of
this “sea” is somehow neutralized.

In a certain sense the model of a “sea” of filled negative-energy
states is a generalization (or extrapolation) of the concept of a
filled electron shell in an atom, whose negative charge is neutralized
by the positive charge of the atomic nucleus. The lack of one
electron in the atomic shell is expressed in the existence of one
positive charge equal in absolute value to the electron charge and
constitutes a kind of analogue or model of the positron.

However, application of the concept of a “sea” to the theory of
positrons has logical faults. First, it is not understandable why
this “sea” proves to be neutralized. Hence the initial concept of a
“neutralized sea” is nonphysical. Second, the concept of a full
set of states (which the Dirac sea is supposed to be) has mathematical
meaning only when these states are discrete. In the case
of a continuous spectrum a full set of states cannot be distinguished
from a partially filled set, and the initial concept of the
theory loses mathematical meaning.

Therefore we must recognize that there is still no closed, logically
faultless theory of positrons. Construction of such a theory
will probably require essentially new physical concepts in addition
to those used in ordinary quantum mechanics.

For this reason this book did not set out to give a full picture
of the present state of the theory of positrons but confined itself
to general remarks about the fundamentals of the theory.